python - How would I collapse rows based on a value in a column? -


i’ll describe mean in more detail here. suppose have data sheet looks this:

+-----------+---------+---------+---------+---------+---------+---------+--------------+ |           | person1 | person2 | person4 | person4 | person5 | person6 |     city     | +-----------+---------+---------+---------+---------+---------+---------+--------------+ | january   | -       |       - | yes     |       - | yes     | -       | sanfrancisco | | febuary   | yes     |       - | -       |       - | -       | -       | sanfrancisco | | march     | -       |       - | -       |       - | -       | -       | sanfrancisco | | april     | -       |       - | -       |       - | -       | -       | newyork      | | may       | yes     |       - | -       |       - | -       | -       | newyork      | | june      | -       |       - | -       |       - | -       | -       | newyork      | | july      | -       |       - | -       |       - | yes     | -       | newyork      | | august    | -       |       - | -       |       - | -       | -       | newyork      | | september | -       |       - | -       |       - | -       | -       | miami        | | november  | -       |       - | -       |       - | -       | yes     | miami        | | december  | -       |       - | -       |       - | -       | -       | miami        | +-----------+---------+---------+---------+---------+---------+---------+--------------+ 

ignoring ascii stackoverflow formatting, it’s simple spreadsheet tracks 6 people based on city they’ve been in months.

what want know is, people have visited cities. condensing list this:

+---------+---------+---------+---------+---------+---------+--------------+ | person1 | person2 | person4 | person4 | person5 | person6 | city         | +---------+---------+---------+---------+---------+---------+--------------+ | yes     | -       | yes     | -       | yes     | -       | sanfrancisco | | yes     | -       | -       | -       | yes     | -       | newyork      | | -       | -       | -       | -       | -       | yes     | miami        | +---------+---------+---------+---------+---------+---------+--------------+ 

each row 1 city, , contains people have visited it. there optimum way this, or rather, there sort of tr(squeeze)/sed tool this? if had code this, optimum logic be?

$ cat tst.awk function prt() {     if ( prev != "" ) {         (i=2;i<=nf;i++) {             printf "%s%s", vals[i], (i<nf ? ofs : ors)         }     }     delete vals }  begin { fs=ofs="," } $nf != prev { prt() } {     (i=1;i<=nf;i++) {         vals[i] = (vals[i] ~ /[[:alpha:]]/ ? vals[i] : $i)     }     prev = $nf } end { prt() }  $ awk -f tst.awk file person1,person2,person4,person4,person5,person6,city yes,-,yes,-,yes,-,sanfrancisco yes,-,-,-,yes,-,newyork -,-,-,-,-,yes,miami 

the above assumes input format csv this:

$ cat file month,person1,person2,person4,person4,person5,person6,city january,-,-,yes,-,yes,-,sanfrancisco febuary,yes,-,-,-,-,-,sanfrancisco march,-,-,-,-,-,-,sanfrancisco april,-,-,-,-,-,-,newyork may,yes,-,-,-,-,-,newyork june,-,-,-,-,-,-,newyork july,-,-,-,-,yes,-,newyork august,-,-,-,-,-,-,newyork september,-,-,-,-,-,-,miami november,-,-,-,-,-,yes,miami december,-,-,-,-,-,-,miami 

and want csv output.


Comments

Popular posts from this blog

jOOQ update returning clause with Oracle -

c# - Json.Net Serialize String from URI -

java - Warning equals/hashCode on @Data annotation lombok with inheritance -