Export CSV dash is \226 (unicode)

  • Thread starter: wjr

wjr

Received an Excel file which needs to be exported to a plain ASCII CSV
file. It appears that the originator had smart quoting or some kind of
Unicode setup for this spreadsheet. The system which requires the CSV
file cannot use Unicode characters. The - characters must be a simple
hex 2d type of dash as in:

| 00 nul| 01 soh| 02 stx| 03 etx| 04 eot| 05 enq| 06 ack| 07 bel|
| 08 bs | 09 ht | 0a nl | 0b vt | 0c np | 0d cr | 0e so | 0f si |
| 10 dle| 11 dc1| 12 dc2| 13 dc3| 14 dc4| 15 nak| 16 syn| 17 etb|
| 18 can| 19 em | 1a sub| 1b esc| 1c fs | 1d gs | 1e rs | 1f us |
| 20 sp | 21 ! | 22 " | 23 # | 24 $ | 25 % | 26 & | 27 ' |
| 28 ( | 29 ) | 2a * | 2b + | 2c , | 2d - | 2e . | 2f / |
| 30 0 | 31 1 | 32 2 | 33 3 | 34 4 | 35 5 | 36 6 | 37 7 |
| 38 8 | 39 9 | 3a : | 3b ; | 3c < | 3d = | 3e > | 3f ? |
| 40 @ | 41 A | 42 B | 43 C | 44 D | 45 E | 46 F | 47 G |
| 48 H | 49 I | 4a J | 4b K | 4c L | 4d M | 4e N | 4f O |
| 50 P | 51 Q | 52 R | 53 S | 54 T | 55 U | 56 V | 57 W |
| 58 X | 59 Y | 5a Z | 5b [ | 5c \ | 5d ] | 5e ^ | 5f _ |
| 60 ` | 61 a | 62 b | 63 c | 64 d | 65 e | 66 f | 67 g |
| 68 h | 69 i | 6a j | 6b k | 6c l | 6d m | 6e n | 6f o |
| 70 p | 71 q | 72 r | 73 s | 74 t | 75 u | 76 v | 77 w |
| 78 x | 79 y | 7a z | 7b { | 7c | | 7d } | 7e ~ | 7f del|

How to strip this out?
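
The requirement above amounts to every byte of the exported file falling in the 00 to 7f range of that table. A minimal Python sketch for locating bytes outside that range; the function name and the sample bytes are made up for illustration and are not part of the original post:

```python
def find_non_ascii(data: bytes):
    """Return (offset, byte value) pairs for every byte outside 7-bit ASCII."""
    return [(offset, value) for offset, value in enumerate(data) if value > 0x7F]


# Hypothetical sample: one byte (hex 96) that is not in the ASCII table.
sample = b"A \x96 B\n"

for offset, value in find_non_ascii(sample):
    # Report each offender in hex and in octal, the notation od -b uses.
    print(f"offset {offset}: hex {value:02x}, octal {value:03o}")
```

To check a real export, read it with `open(path, "rb").read()` and pass the result to the same function.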
 
You want to get rid of all the "pipes" that are delimiting the text?

The pipe character is found above the Enter key (Shift + \).

Edit>Replace

What: Alt + 0124 ( enter digits via Numpad)

With: nothing


Gord Dibben MS Excel MVP
 
Not the problem at all. I was showing the ASCII table for those who
don't know it.

Start from here in case you are confused.
Here is a real piece of data saved from Excel as a CSV file. NOTE: it's
part of the first field, so don't get confused by the lack of a ','
character.
AS SEEN IN EXCEL:
Animal Rights - General

AS SAVED TO CSV:
Animal Rights û General

Here is an octal dump of the saved part:
od -bc x
0000000 101 156 151 155 141 154 040 122 151 147 150 164 163 040 226 040
          A   n   i   m   a   l       R   i   g   h   t   s     226
0000020 107 145 156 145 162 141 154 040 040 040 012
          G   e   n   e   r   a   l              \n
0000033

As you can very clearly see from this, the - dash character is octal 226
(hex 96) and not a proper - (octal 055, hex 2d).
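
One way to see what that byte is: decode it under a couple of single-byte code pages. The Python sketch below assumes the file was written in Windows-1252 and then viewed in a terminal using code page 437. Neither is confirmed in this thread, but that combination would account for both the dash seen in Excel and the û seen in the saved file.

```python
# The byte od reports as octal 226.
suspect = bytes([0o226])

# Same byte in hex, for comparison with the ASCII table.
print(hex(suspect[0]))

# Assumption: Excel wrote the file as Windows-1252, where this byte is an en dash.
as_cp1252 = suspect.decode("cp1252")
print(ascii(as_cp1252))

# Assumption: the file was viewed under code page 437, where the same byte is u-circumflex.
as_cp437 = suspect.decode("cp437")
print(ascii(as_cp437))
```

`ascii()` is used for printing so the sketch does not depend on the console being able to display either character.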

Here is what I should be getting:
0000000   A   n   i   m   a   l       R   i   g   h   t   s       -
        101 156 151 155 141 154 040 122 151 147 150 164 163 040 055 040
0000020   G   e   n   e   r   a   l  \n
        107 145 156 145 162 141 154 012
0000030
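
Outside Excel, the saved file could be post-processed to get that result. This is a rough Python sketch, not a method anyone in the thread proposed. It assumes the only offending byte is hex 96, and the helper name is made up for illustration.

```python
def to_ascii_dashes(data: bytes) -> bytes:
    """Replace every hex 96 byte (octal 226) with an ASCII hyphen, hex 2d."""
    return data.replace(b"\x96", b"-")


# The bytes from the octal dump of the saved CSV line.
saved = b"Animal Rights \x96 General   \n"
fixed = to_ascii_dashes(saved)

print(fixed)
# Confirm that nothing outside 7-bit ASCII is left (bytes.isascii needs Python 3.7+).
print(fixed.isascii())
```

For a whole file, read it with `open(path, "rb")`, pass the bytes through the helper, and write the result back in binary mode. The trailing spaces before the newline in the saved line are left alone here; removing them would be a separate step.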
 
I sure missed this one!

I don't speak Octal<g>


Gord

 