Reading a file:
use strict;
use warnings;
my $filename = 'xxx.txt';
open(my $fh, '<:encoding(UTF-8)', $filename) or die "Could not open file '$filename': $!"; # $fh stands for filehandle
my $count = 0;
while (my $row = <$fh>) {
chomp $row;
print "$row\n";
$count++;
}
close($fh);
(The code above triggers a bug in Blogspot's syntax highlighter: it turns into garbled text when switching from the HTML view to the Compose view.)
http://ind.ntou.edu.tw/~dada/cgi/Perlsynx.htm
$_ The default input and pattern-searching space.
$! Contains the current value of errno.
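A small sketch of how these two variables show up in practice. The helper names upper_all and open_error and the file name no_such_file.txt are made up for this illustration; the file is assumed not to exist.

```perl
use strict;
use warnings;

# $_ is the implicit topic variable: map, grep and for set it for each item.
sub upper_all {
    return map { uc $_ } @_;
}

# $! holds the OS error left behind by the last failed system call.
sub open_error {
    my ($path) = @_;
    if (open(my $fh, '<', $path)) {
        close($fh);
        return '';       # open succeeded, nothing to report
    }
    return "$!";         # stringified errno; the exact text depends on the OS
}

print join(',', upper_all('a', 'b')), "\n";
print "open failed: ", open_error('no_such_file.txt'), "\n";
```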
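Replacing text in a file:
http://stackoverflow.com/questions/4732937/how-do-i-read-a-file-line-by-line-while-modifying-lines-as-needed (uses Tie::File)
The content filtered out of xxx.txt is stored as a hash ref ($v); bbb.txt is then walked line by line, and the second match of the city name on each city line is replaced.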
use Tie::File;
use Data::Dumper;    # needed for the Dumper call at the end

my @file_array;
# "or", not "||": with "||" the die binds to the string 'bbb.txt' and never fires
tie @file_array, 'Tie::File', 'bbb.txt' or die "END! $!";

my $country = '';
my $count = 0;
my $city = '';
my $no_match_ref;
my $no_match_ref_count = 0;
my $v = {
    'Bolivia' => { 'El Beni' => 'El Beni' },
    'Canada'  => { 'Newfoundland' => 'Newfoundland', 'Yukon Territory' => 'Yukon Territory' },
};

for my $line (@file_array) {
    # s/測試/一二三/g;    # Replace PERL with Perl everywhere in the file

    # country line
    if ( $line =~ /v ===/ ) {
        my @country_arr = split(/"/, $line);
        $country = $country_arr[1];
    }

    # city line
    if ( $line =~ /ss\(f,\s\d/ ) {
        my @city_arr = split(/"/, $line);
        # print $line,"\n";
        $city = $city_arr[1];
        # print "country:$country, city:$city, deftag:$v->{$country}{$city}\n"; # - #832
        if ( $v->{$country}{$city} ) {
            my $index = 2;    # replace only the second match on the line
            # \Q...\E keeps regex metacharacters in the city name literal
            $file_array[$count] =~ s/(\Q$city\E)/--$index == 0 ? $v->{$country}{$city} : $1/ge;
        }
        if ( !$v->{$country}{$city} ) {
            $no_match_ref->{$country}{$city} = $city;
            $no_match_ref_count++;
        }
        # &use_reg_exp_to_match(); # no use, because I will use SQL
    }
    $count++;
}

print Dumper $no_match_ref;
print "\n no_match_ref counter:$no_match_ref_count"; # - #296
untie @file_array;
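The "replace only the second match" trick is easier to see in isolation. A minimal sketch, assuming a made-up input line; replace_nth is a hypothetical helper name and is not part of the script above.

```perl
use strict;
use warnings;

# Replace only the $n-th occurrence of $target in $text with $replacement.
sub replace_nth {
    my ($text, $target, $replacement, $n) = @_;
    my $seen = 0;
    # /e runs the right-hand side as code for every match, so a counter can
    # decide per match whether to substitute or to put the original text back.
    $text =~ s/(\Q$target\E)/++$seen == $n ? $replacement : $1/ge;
    return $text;
}

# Made-up sample line, only loosely shaped like the city lines in bbb.txt.
print replace_nth('ss(f, 1, "Yukon", "Yukon")', 'Yukon', 'YT', 2), "\n";
# prints: ss(f, 1, "Yukon", "YT")
```

Counting up to $n instead of counting down from 2 makes the same helper usable for any occurrence.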
Counting the keys of a hash ref
$v = {
    'Bolivia' => { 'El Beni' => 'El Beni' },
    'Canada'  => { 'Newfoundland' => 'Newfoundland', 'Yukon Territory' => 'Yukon Territory' },
    ...
};
$count = 0;
foreach my $val ( keys %{$v} ) {    # Method 1: loop with foreach
    $count++;
}
print "\n$count\n";
print scalar keys %{$v};    # Method 2; "keys $v" on a bare reference was experimental and is rejected by newer Perl versions
Searching for Chinese characters with a regex in Sublime
http://stackoverflow.com/questions/1585914/matching-chinese-characters-with-regular-expressions-php
[\x{4e00}-\x{9fa5}] --> One char between 4E00 and 9FA5
http://www.regular-expressions.info/unicode.html
\p{Han} is the Perl syntax (untested)
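A minimal sketch for trying both patterns in Perl itself rather than in Sublime. The helper names has_han and has_cjk_range are made up for this example, and the script is assumed to be saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;    # this source file contains literal UTF-8 characters

# True if the string contains at least one character in the Han script.
sub has_han {
    my ($s) = @_;
    return $s =~ /\p{Han}/ ? 1 : 0;
}

# Same idea with the explicit code point range quoted above.
sub has_cjk_range {
    my ($s) = @_;
    return $s =~ /[\x{4e00}-\x{9fa5}]/ ? 1 : 0;
}

binmode(STDOUT, ':encoding(UTF-8)');
for my $s ('hello', '測試 test') {
    printf "%s: p{Han}=%d range=%d\n", $s, has_han($s), has_cjk_range($s);
}
```

\p{Han} covers more than the 4E00-9FA5 block (for example the CJK extension blocks), so the two checks can disagree on rarer characters.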
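How to output a unique list of the file names that were modified?
Tools: cygwin, sublime, (Komodo Edit 8, a Perl editor)
1. Drag the folder onto the cygwin window so you can cd straight into that directory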
2. $ ls -R work* > ls_files.txt
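3. Open sublime, filter the file names out of ls_files.txt, and save them as a new file, all_files.txt
4. Write filter_template.pl to open all_files.txt and reduce the modified templates to unique entries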
use strict;
use warnings;
use Data::Dumper;

open(my $fh, "<:encoding(UTF-8)", "all_files.txt") or die "file not exist";
my @my_array;
my $count = 0;
while (my $row = <$fh>) {
    chomp $row;
    push @my_array, $row;
    $count++;
}
close($fh);

sub uniq {
    return keys %{{ map { $_ => 1 } @_ }};
}

print join(" ", uniq(@my_array)), "\n";
5. $ perl filter_template.pl > result.txt
result.txt is the result. Each $row read from the file still ends with a line break (unresolved).
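One possible cause of the leftover line break, assuming all_files.txt was saved with Windows-style CRLF line endings (a guess, not something verified here): chomp removes only the trailing "\n", so a "\r" stays on every row. A sketch of a workaround; strip_eol is a hypothetical helper name and the sample row is made up.

```perl
use strict;
use warnings;

# Remove a trailing LF or CRLF; chomp alone would leave the "\r" behind.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

my $row = "template_a.html\r\n";    # made-up sample row with a CRLF ending
my $chomped = $row;
chomp $chomped;                     # strips "\n" only, "\r" is still there
printf "after chomp: %d chars, after strip_eol: %d chars\n",
    length($chomped), length(strip_eol($row));
```

If this is the cause, calling strip_eol($row) in place of chomp $row in filter_template.pl should clear it.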
References:
Pushing a value onto an array
http://perl.hcchien.org/ch03.html
push @my_array, $row;
How do I remove duplicate items from an array in Perl?
http://stackoverflow.com/questions/7651/how-do-i-remove-duplicate-items-from-an-array-in-perl
sub uniq {
    return keys %{{ map { $_ => 1 } @_ }};
}

@my_array = ("one","two","three","two","three");
print join(" ", @my_array), "\n";
print join(" ", uniq(@my_array)), "\n";
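The hash-keys version of uniq returns the items in no particular order. A short sketch of an order-preserving variant using a %seen hash; uniq_ordered is a name chosen for this example.

```perl
use strict;
use warnings;

# Keep the first occurrence of each item and preserve the input order.
sub uniq_ordered {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @my_array = ("one", "two", "three", "two", "three");
print join(" ", uniq_ordered(@my_array)), "\n";    # prints "one two three"
```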