Extract int from string in dataframe

Hello,

I have a dataframe as below:

How do I extract only the oz value as a number into a separate column?

For example:

5.3 from 5.3 oz (151 g)
22 from 22 fl oz cup

Dataframe:
5.3 oz (151 g)
5 oz (142 g)
3.4 oz (97 g)
3.5 oz (98 g)
14.2 oz (403 g)
22 fl oz cup
13.4 oz (381 g)
4 oz (114 g)
2.3 oz (65 g)
2 oz (56 g)
7.9 oz (223 g)
16 fl oz cup
5.9 oz (168 g)
20 fl oz cup
20 fl oz cup
10.1 oz (285 g)
20 fl oz cup
20 fl oz cup
22 fl oz cup
22 fl oz cup
22 fl oz cup
16.2 oz (460 g)
22 fl oz cup
22 fl oz cup
20 fl oz cup
20 fl oz cup
20 fl oz cup
20 fl oz cup

Vijay

Hi @VijayVignesh,

  1. Are you trying to retrieve the leading float value, or its integer version? For example, 5.3 is a float, but its integer version would be 5.
  2. Either way, you can either loop over the column and call .split() on each string, or do the same thing in one line with apply, keeping the first piece of each string:
df['<column name>'].apply(lambda s: s.split(' ')[0])
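The split-based idea above can be sketched as follows. This is only a minimal illustration: the column name "size" and the three sample rows are assumptions made for the example, since the thread does not give the real column name.

```python
import pandas as pd

# Hypothetical column name "size"; the rows are copied from the question.
df = pd.DataFrame({"size": ["5.3 oz (151 g)", "5 oz (142 g)", "22 fl oz cup"]})

# Keep the text before the first space, then convert it to float.
df["oz"] = df["size"].str.split(" ").str[0].astype(float)

print(df["oz"].tolist())  # [5.3, 5.0, 22.0]
```

If the truncated integer is wanted instead (5 from 5.3), chaining .astype(int) after .astype(float) should give that.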

Yes, I needed the float, so I applied float() to the result of str.split() and got the output I wanted. Your version is more compact, though, and is better coding practice. Thank you for the suggestion.

Cheers,
Vijay


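Another approach is to use regular expressions. The decimal part of the pattern has to be optional so that whole numbers such as 5 and 22 also match:

df['<column name>'].str.extract(r'^(\d+(?:\.\d+)?)', expand=False).astype(float)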
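As a rough, self-contained sketch of the regex route, the example below assumes a hypothetical column named "size" and uses a few rows from the question. It relies on str.extract with a single capture group, which returns a Series when expand=False.

```python
import pandas as pd

# Hypothetical column name "size"; the rows are copied from the question.
df = pd.DataFrame(
    {"size": ["5.3 oz (151 g)", "5 oz (142 g)", "22 fl oz cup", "14.2 oz (403 g)"]}
)

# Capture the number at the start of the string; the decimal part is optional.
pattern = r"^\s*(\d+(?:\.\d+)?)"
df["oz"] = df["size"].str.extract(pattern, expand=False).astype(float)

print(df["oz"].tolist())  # [5.3, 5.0, 22.0, 14.2]
```

Anchoring the pattern at the start of the string keeps it from picking up the gram value inside the parentheses.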

@Ishvinder, what do you think?
