How to split a text into an array of words, then rebuild the text and keep the punctuation?

Thread starter: Pascal

How can I store all the words of a text in an array, modify some of them, then
reconstruct the text while retaining the original punctuation?

I tried this:
words = TextBox1.Text.Split(New Char() {" "c, ","c, "."c, ":"c, _
    ";"c, "?"c, "!"c, "'"c, "`"c, "-"c}, StringSplitOptions.RemoveEmptyEntries)

then:
Dim WordsSpan = words.Clone
Dim word As String
For Each word In words
    word = "<span class=""blue"" onclick=""verif(this)"">" & word & "</span>"
    TextBox2.Text &= word
Next

so the text
" Les poules du couvent couvent l'obscurité la nuit.

Pourquoi? moi, eux-mêmes!
radis noirs: sérieux"

becomes:

<span class="blue" onclick="verif(this)">Les</span><span class="blue"
onclick="verif(this)">poules</span><span class="blue"
onclick="verif(this)">du</span><span class="blue"
onclick="verif(this)">couvent</span><span class="blue"
onclick="verif(this)">couvent</span><span class="blue"
onclick="verif(this)">l</span><span class="blue"
onclick="verif(this)">obscurité</span><span class="blue"
onclick="verif(this)">la</span><span class="blue"
onclick="verif(this)">nuit</span><span class="blue" onclick="verif(this)">

Pourquoi</span><span class="blue" onclick="verif(this)">moi</span><span
class="blue" onclick="verif(this)">eux</span><span class="blue"
onclick="verif(this)">mêmes</span><span class="blue" onclick="verif(this)">
radis</span><span class="blue" onclick="verif(this)">noirs</span><span
class="blue" onclick="verif(this)">sérieux</span>


and all the punctuation disappears!

Any ideas, please?
Thanks
 
Thanks.
I read the article, but I am not sure how to get back the punctuation I lost
with the Split method.

I think I am not using the right code for what I want to do (described in my
previous post): add a span tag to each word and keep the punctuation. Perhaps
there is another way to break the text up into its words?
Thanks,
Pascal
 
For complex splits like this, I think the
System.Text.RegularExpressions.Regex class works much better. Regex is
very powerful, although the learning curve seems steep at first. If you go
this route, I highly recommend a product called "Expresso" for testing
expressions. It is available as a free download at
http://www.ultrapico.com.
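
As a rough sketch of the Regex route (not tested against your project): when the separator pattern is wrapped in capturing parentheses, Regex.Split returns the separators along with the words, so nothing is lost when the array is joined back together. The module name SplitKeepPunctuation and the helpers Tokenize and WrapWords are made up for this illustration, and "(\W+)" is only one possible separator pattern.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitKeepPunctuation
    ' Hypothetical helper: splits text into words AND separators.
    ' The capturing parentheses make Regex.Split keep the separators,
    ' so the result alternates: word, separator, word, separator, ...
    Function Tokenize(ByVal text As String) As String()
        Return Regex.Split(text, "(\W+)")
    End Function

    ' Hypothetical helper: wraps every word in a span, then rebuilds the text.
    Function WrapWords(ByVal text As String) As String
        Dim tokens As String() = Tokenize(text)
        ' Words sit at the even indexes; odd indexes hold the separators.
        For i As Integer = 0 To tokens.Length - 1 Step 2
            ' The first or last token is empty when the text starts or
            ' ends with a separator, so skip empty entries.
            If tokens(i).Length > 0 Then
                tokens(i) = "<span class=""blue"" onclick=""verif(this)"">" &
                            tokens(i) & "</span>"
            End If
        Next
        ' Concatenating all tokens restores the punctuation and whitespace.
        Return String.Concat(tokens)
    End Function

    Sub Main()
        Console.WriteLine(WrapWords("Pourquoi? moi, eux-mêmes!"))
    End Sub
End Module
```

Because the words stay in an ordinary array between the split and the concatenation, individual entries can be modified before rebuilding. Note that \W treats the apostrophe and the hyphen as separators, so "l'obscurité" and "eux-mêmes" each become two words, as in the original Split call.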
 
Thanks again. I have been digging in this direction for the last few hours
too, and I will post my new problem in another thread.
'This is what I use in JavaScript. How can I do this in VB.NET?
'txt = txt.replace(/([\wéçâêîôûàèùëï]+)(?=[\s'".,;!?«»\-(\)])/gi,'<span class=\'blue\' onclick="verif(this)">$1<\/span>');


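' Requires "Imports System.Text.RegularExpressions" at the top of the file.
Private Sub BtnRegx_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) _
    Handles BtnRegx.Click
    Dim input As String = TextBox1.Text
    ' No stray spaces inside the pattern: a space is matched literally.
    ' \w is Unicode-aware in .NET, so accented letters are covered.
    ' The lookahead also accepts ":" and the end of the text ($).
    Dim pattern As String = "(\w+)(?=[\s'"".,;:!?«»\-()]|$)"
    ' $1 stands for the captured word; concatenating the whole input
    ' here would paste the full text into every span.
    Dim replacement As String = "<span class=""blue"" onclick=""verif(this)"">$1</span>"
    Dim rgx As New Regex(pattern)
    Dim result As String = rgx.Replace(input, replacement)
    TBoxRegx.Text = result
End Sub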
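
If only some of the words should be changed, one option is to give Regex.Replace a function instead of a replacement string; the function is called once per matched word and decides what to return. This is a minimal sketch, not a definitive answer: the module name WrapSomeWords, the helpers TagWords and WrapWord, and the "longer than three characters" rule are all invented for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WrapSomeWords
    ' Hypothetical rule: only words longer than 3 characters get a span.
    ' Any other test (a dictionary lookup, for example) could go here.
    Function WrapWord(ByVal m As Match) As String
        If m.Value.Length > 3 Then
            Return "<span class=""blue"" onclick=""verif(this)"">" &
                   m.Value & "</span>"
        End If
        ' Returning the match unchanged leaves the word as it was.
        Return m.Value
    End Function

    ' Only the text matched by \w+ is replaced; punctuation and
    ' whitespace are never matched, so they stay where they are.
    Function TagWords(ByVal input As String) As String
        Return Regex.Replace(input, "\w+", AddressOf WrapWord)
    End Function

    Sub Main()
        Console.WriteLine(TagWords("radis noirs: sérieux"))
    End Sub
End Module
```

Since the regex only ever touches the words themselves, there is no separate rebuild step: the punctuation is preserved simply because it is never part of a match.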
 