
awk: turning multiple lines into one line (processing blocks of text)


While processing Alibaba Cloud order records today, I ran into the following situation: cat eee.txt

202087290320221
增量带宽
订单
2018-06-03 22:14:21
¥222.02
¥222.02

202085390830221
云服务器ECS(包月)
订单
2018-06-03 22:07:01
¥2552.00
¥2552.00

202088490000221
云服务器ECS(包月)
订单
2018-06-03 22:04:39
¥556.00
¥556.00

After processing, it should be in the following format:

202087290320221|增量带宽|订单|2018-06-03 22:14:21|¥222.02|¥222.02

202085390830221|云服务器ECS(包月)|订单|2018-06-03 22:07:01|¥2552.00|¥2552.00

202088490000221|云服务器ECS(包月)|订单|2018-06-03 22:04:39|¥556.00|¥556.00

The data above is quite regular: each block contains six lines of data, followed by one blank line, then another six-line block, and so on.

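With awk, all we need to do is change the record separator RS. RS defaults to "\n"; we can change it to RS="". An empty RS means that blank lines are used to separate records.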

The meaning of RS="", alongside the other possible values of RS, is as follows:

RS == "\n"

Records are separated by the newline character (`\n'). In effect, every line in the data file is a separate record, including blank lines. This is the default.

RS == any single character

Records are separated by each occurrence of the character. Multiple successive occurrences delimit empty records.

RS == ""

Records are separated by runs of blank lines. The newline character always serves as a field separator, in addition to whatever value FS may have. Leading and trailing newlines in a file are ignored.

RS == regexp

Records are separated by occurrences of characters that match regexp. Leading and trailing matches of regexp delimit empty records. (This is a gawk extension; it is not specified by the POSIX standard.)
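To make the RS == "" case concrete, here is a minimal sketch. The two-block sample text and the function name count_blocks are made up for illustration and are not part of the original data. It prints the record number and field count for each block, with FS left at its default:

```shell
#!/bin/sh
# Hypothetical sample: two blocks separated by one blank line.
sample='a b
c

d
e'

# RS="" makes each run of non-blank lines one record.
# FS is left at its default, so spaces and newlines both split fields.
count_blocks() {
    printf '%s\n' "$sample" | awk 'BEGIN { RS = "" } { print NR ": " NF " fields" }'
}

count_blocks
```

The first block yields three fields (a, b, c) and the second yields two, which shows that the newline acts as a field separator inside a record.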

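Once RS="" is set, FS should be changed as well. By default FS splits fields on whitespace, and in the example above the timestamp contains a space but has to stay in a single field. So we set FS="\n", which turns each line of the original data into one field.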
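The effect of FS can be checked with a small sketch. The two-line block below is a shortened, illustrative stand-in for one order block, and the function names are hypothetical:

```shell
#!/bin/sh
# Illustrative block: an order id and a timestamp that contains a space.
block='202087290320221
2018-06-03 22:14:21'

# Default FS: the space inside the timestamp splits it into two fields.
fields_default() {
    printf '%s\n' "$block" | awk 'BEGIN { RS = "" } { print NF }'
}

# FS="\n": each input line is exactly one field.
fields_newline() {
    printf '%s\n' "$block" | awk 'BEGIN { RS = ""; FS = "\n" } { print NF }'
}

fields_default
fields_newline
```

With the default FS the block is seen as three fields; with FS="\n" it is seen as two, one per line.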

Each block in the original data has six lines, so we can format it with the following command:

awk 'BEGIN{FS="\n";RS="";OFS="|"}{print $1,$2,$3,$4,$5,$6}' eee.txt

For the text above, this command does what we want, but it is not a general solution. Take the following text, for example: cat fff.txt

huanxgin

XIAN

711711

HANGZHOU

399229

chianzhonggua dddo

fdfdsf

Shanghai

888912

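In this text, the number of lines in each "block" varies. To handle this kind of input, we use a loop to walk through the fields of each block and print them: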

awk 'BEGIN{FS="\n";RS="";ORS="|"} { for(x=1;x<NF;x++) { print $x "\t"} print $NF "\n"}' fff.txt |sed 's/^|//g'

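With x<NF, the loop prints fields 1 through $(NF-1), each followed by a tab; $NF is then printed with a trailing newline, and awk moves on to the next RS="" record. Because ORS="|", every print also appends a "|", which is why sed is used to strip the "|" that ends up at the start of each following line.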
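As an alternative sketch (not the command used above), the same kind of joining can be done without a loop or sed by letting awk rebuild the record: assigning $1 = $1 forces awk to re-join all fields with OFS. The function name join_blocks and the sample input are hypothetical:

```shell
#!/bin/sh
# Join every line of each blank-line-separated block with "|".
# Reads the files given as arguments, or stdin when there are none.
join_blocks() {
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "|" } { $1 = $1; print }' "$@"
}

# Hypothetical input: one block of two lines, one block of three lines.
printf 'a\nb c\n\nd\ne\nf\n' | join_blocks
```

This works for any number of lines per block and separates fields with "|" only, without the tab that the loop version inserts.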

Posted 2018-06-20 17:54
Category: System Administration
