text processing - sed to copy part of line to end


I'm trying to copy part of a line and append it to the end:

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/gca/900/169/985/gca_900169985.1_ionxpress_024_genomic.fna.gz 

becomes:

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/gca/900/169/985/gca_900169985.1/gca_900169985_ionxpress_024_genomic.fna.gz 

I have tried:

sed 's/\(.*(gca_\)\(.*\))/\1\2\2)' 

$ f1=$'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/gca/900/169/985/gca_900169985.1_ionxpress_024_genomic.fna.gz'
$ echo "$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/gca/900/169/985/gca_900169985.1_ionxpress_024_genomic.fna.gz
$ sed -E 's/(.*)(gca_.[^.]*)(.[^_]*)(.*)/\1\2\3\/\2\4/' <<<"$f1"
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/gca/900/169/985/gca_900169985.1/gca_900169985_ionxpress_024_genomic.fna.gz

sed -E (or -r on some systems) enables extended regex support in sed, so you don't need to escape the group parentheses ( ).
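To illustrate the difference between the two syntaxes, here is a minimal sketch that runs the same substitution once with extended regex and once with basic regex, where every group has to be written as \( \). The variable names ere and bre are only illustrative, and printf with a pipe is used instead of <<< so the snippet does not depend on bash.

```shell
# Sample input taken from the question.
f1='ftp://ftp.ncbi.nlm.nih.gov/genomes/all/gca/900/169/985/gca_900169985.1_ionxpress_024_genomic.fna.gz'

# Extended regex (-E): groups are written as plain ( ).
ere=$(printf '%s\n' "$f1" | sed -E 's/(.*)(gca_.[^.]*)(.[^_]*)(.*)/\1\2\3\/\2\4/')

# Basic regex (no -E): the same groups must be escaped as \( \).
bre=$(printf '%s\n' "$f1" | sed 's/\(.*\)\(gca_.[^.]*\)\(.[^_]*\)\(.*\)/\1\2\3\/\2\4/')

# Both variables should hold the same rewritten URL.
printf '%s\n%s\n' "$ere" "$bre"
```

Both forms should print the same line; the extended form is just easier to read once there are several groups.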

The pattern (gca_.[^.]*) means "match gca_ followed by the characters up to, but excluding, the first dot found":

$ sed -E 's/(.*)(gca_.[^.]*)(.[^_]*)(.*)/\2/' <<<"$f1"
gca_900169985

Similarly, (.[^_]*) means all characters up to the first _ found (excluding the _ character itself). This is the regex way to perform a non-greedy/lazy capture (in Perl regex it could have been written as .*? followed by _).

$ sed -E 's/(.*)(gca_.[^.]*)(.[^_]*)(.*)/\3/' <<<"$f1"
.1
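The effect of the negated character class can be seen by comparing it with a plain greedy .* on the tail of the same filename. This is only a sketch: the variable names s, greedy and lazy are illustrative, and the anchors ^ and $ are added here just to make the comparison unambiguous.

```shell
# Tail of the filename from the question, starting at the dot.
s='.1_ionxpress_024_genomic.fna.gz'

# Greedy: (.*) grabs as much as it can, so it stops at the LAST underscore.
greedy=$(printf '%s\n' "$s" | sed -E 's/^(.*)_.*$/\1/')

# Negated class: ([^_]*) cannot cross an underscore, so it stops at the FIRST one.
lazy=$(printf '%s\n' "$s" | sed -E 's/^([^_]*)_.*$/\1/')

printf 'greedy: %s\nlazy:   %s\n' "$greedy" "$lazy"
```

The greedy version keeps .1_ionxpress_024, while the negated class keeps only .1, which is the behaviour the third group relies on.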
