Extract numeric characters plus one character...

  • Thread starter Thread starter KLZA
  • Start date Start date
K

KLZA

I have the following text on a column's row where I need to extract
only the numeric characters plus one. ABCDEFG 10M ABCDEFG. I'd like
to extract only the 10M (numeric values plus one character on the
right) for each item. I'm dealing with thousands of rows of similar
data but I'm not sue how to achieve this. Can anyone help?
 
Try this Arrary function ( use ctrl + shift + enter )

=MID(A1,MATCH(TRUE,ISNUMBER(1*MID(A1,ROW($1:$30),1)),0),COUNT(1*MID
(A1,ROW($1:$30),1),1))
 
Is your data always the same "shape" (that is, 7 characters followed by a
space followed by the 3 characters you want followed by a space and the rest
of the text)? If so...

=MID(A1,9,3)

If the amount of characters in front of what you want is not a fixed number,
are the numbers in your data always the first numbers in the cell (as shown
in your sample)? If so...

=MID(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),3)
 
Try this Arrary function ( use ctrl + shift + enter )

=MID(A1,MATCH(TRUE,ISNUMBER(1*MID(A1,ROW($1:$30),1)),0),COUNT(1*MID
(A1,ROW($1:$30),1),1))



- Show quoted text -

Hi. That didn't work.
 
Another possibility - if the text you want is always preceded by a space and
that is the first space in the text, then you can use this...

=MID(A1,FIND(" ",A1&" ")+1,3)
 
I have the following text on a column's row where I need to extract
only the numeric characters plus one. ABCDEFG 10M ABCDEFG. I'd like
to extract only the 10M (numeric values plus one character on the
right) for each item. I'm dealing with thousands of rows of similar
data but I'm not sue how to achieve this. Can anyone help?

Is the part you wish to extract always the "next-to-last" word?

If so then:

=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",198)),99,198))

IF not, post some more examples with more variability.
--ron
 
Is the part you wish to extract always the "next-to-last" word?

If so then:

=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",198)),99,198))

IF not, post some more examples with more variability.
--ron

Thanks. The data different lengths of txt before the numeric value
and after the value. the numerics can be 1-5 digits long i only need
to capture the first character after the numeric values. there is no
other logic with spacing etc...
 
KLZA said:
Hi. That didn't work.


Actually, with "ABCDEFG 10M ABCDEFG" in A1, it results in "10M", which I believe
was your desired result. So, how exactly didn't it work?
 
Actually, with "ABCDEFG 10M ABCDEFG" in A1, it results in "10M", which I believe
was your desired result.  So, how exactly didn't it work?- Hide quoted text -

- Show quoted text -

Hi Thanks for the help. I need to capture only the numeric characters
(any length) preceded by alphas (any length) and only the first alpha
after the length of numbers. My data ccould be TTTTT10MTTTTT or
TTT1000MTTT or TTTTTTT1MTTTTTTTTTT etc..
 
Try this formula...

=MID(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),LEN(LOOKUP(9E+307,--LEFT(MID(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),999),ROW(1:999))))+1)

--
Rick (MVP - Excel)


Actually, with "ABCDEFG 10M ABCDEFG" in A1, it results in "10M", which I
believe
was your desired result. So, how exactly didn't it work?- Hide quoted
text -

- Show quoted text -

Hi Thanks for the help. I need to capture only the numeric characters
(any length) preceded by alphas (any length) and only the first alpha
after the length of numbers. My data ccould be TTTTT10MTTTTT or
TTT1000MTTT or TTTTTTT1MTTTTTTTTTT etc..
 
Here's a VBA example, not as neat as a worksheet function but here ya go...

Cells A1:I1
TTTT100TT
cell J1 enter =GetNumAndChar(A1:I1)

Dragging the lower right corner of this cell to other locations create the
referrenced incremented ranges.

HTH

<begin copy omit this line>
Option Explicit

Public Function GetNumAndChar(ByVal rRange As Excel.Range) As String

Dim in_value As String
Dim iRow, iCol, iposit, iStop As Integer
in_value = ""
iRow = 1

For iCol = 1 To rRange.Cells.Count
If rRange.Cells(iRow, iCol) = "." Then iCol = iCol + 1
If IsNumeric(rRange.Cells(iRow, iCol)) Then
in_value = in_value & rRange.Cells(iRow, iCol)
If IsNumeric(rRange.Cells(iRow, iCol)) And Not
IsNumeric(rRange.Cells(iRow, iCol + 1)) Then
in_value = in_value & rRange.Cells(iRow, iCol + 1)
Exit For
End If
End If
Next

If IsNull(in_value) Then
GetNumAndChar = " "
Else: GetNumAndChar = in_value
End If

End Function
<end copy omit this line>
 
Thanks. The data different lengths of txt before the numeric value
and after the value. the numerics can be 1-5 digits long i only need
to capture the first character after the numeric values. there is no
other logic with spacing etc...

Easy to do with a UDF:

To enter this <alt-F11> opens the VB Editor. Ensure your project is
highlighted in the Project Explorer window, then Insert/Module and paste the
code below into the window that opens.

To use this, enter a formula of the type:

=extrNumsPlusOne(A1)

The pattern (in the UDF below) will find the first series of digits and a
following single alpha character.

==============================
Function extrNumsPlusOne(str As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.Pattern = "\d+[A-Za-z]"
If re.test(str) = True Then
Set mc = re.Execute(str)
extrNumsPlusOne = mc(0).Value
End If
End Function
============================
--ron
 
Just so you know, in this statement from you code...
Dim iRow, iCol, iposit, iStop As Integer

only iStop is declared as an Integer... iRow, iCol and iposit are all
declared as Variants. In VB, you must declare each variable individually as
to its Type. Also, in the current 32-bit world of computing, there is no
real advantage to declaring a variable as Integer rather than Long... an
Integer will take up the memory space of a Long when stored there. So,
combining these two thoughts, you could do this...

Dim iRow As Long, iCol As Long, iposit As Long, iStop As Long

or this

Dim iRow As Long
Dim iCol As Long
Dim iposit As Long
Dim iStop As Long

your choice.
 
Extracting numeric data from rows

You have rows in file (assume X.dat) that have the following form.

TTTTT10MTTTTT
TTT1000MTTT

TTTTTTT1MTTTTTTTTTT
etc..

You need to extract the numeric form and one char to the right. The output will be

10M
1000M
1M

Cool.

Use the following script in biterscripting.

var str data ; cat X.dat > $data
# One row at a time
while ($data <> "")
do
var str row ; lex -e "1" $data > $row
# Remove everything till the first numeric character.
stex -r "]^(0123456789)^" $row > null

# Remove everything after the last numeric character.
# But save it, we will need to get the first character from it.
var str after_str ; stex -r "^(0123456789)^l[" $row > $after_str

# $row now has numeric portion. Add the first char from $after_str
# Then, output.
set $row = $row + { chex "1" $after_str } ; echo $row
done

Copy and paste this script (I have tested it on your data) in file C:/something/extract.txt . Put the X.dat in the same directory. Start biterscripting Interactive. Type the following commands.

cd "C:/something"
script extract.txt

That should do it. I have tested it. If you don't have biterscripting, download it free from, I think, http://www.biterscripting.com .

Patrick
 
Back
Top