I ran into this problem during a data migration when a colleague handed me a CSV full of product codes like “SKU-4829-A” and “ORD-1034-B” — and I needed to pull out just the numbers for validation. Manual extraction was not happening, so I wrote a small Python script to do it automatically. That script is what this article is about.
This article covers two ways to extract digits from a Python string — using isdigit() and using regex. By the end, you’ll have a clear picture of when each approach makes sense, with working examples you can copy-paste into your own code.
TLDR
- Use str.isdigit() to check each character; it works for simple, predictable strings. The isdigit() method is covered in detail on AskPython.
- Use re.findall(r'\d+', s) to extract digit groups in one line; it handles complex patterns.
- Combine filter() with isdigit() for a functional-style alternative.
- "".join() over a comprehension or generator expression is usually faster than a for loop that concatenates character by character.
- Regex is more powerful but adds an import, so only reach for it when you need pattern matching.
What is extracting digits from a string in Python?
Extracting digits from a string means pulling out numeric characters (0 through 9) and ignoring everything else. For example, from “Order-1042-abc” you would get the number 1042. Python gives you two direct tools for this: str.isdigit() to test individual characters, and the re module to match digit patterns across the whole string.
Method 1: Using str.isdigit()
The isdigit() method is a built-in string method that returns True if the string is non-empty and every character in it is a digit. When you call it on a single character, it tells you whether that character is a digit, which for ordinary ASCII text means 0 through 9. This makes it useful for scanning a string character by character and collecting the ones that pass the test. If you need more complex pattern matching across the full string, Python’s regular expression module is the alternative.
The simplest version iterates over the string and builds a new string from matching characters:
inp_str = "SKU-4829-A"
result = ""
for char in inp_str:
    if char.isdigit():
        result += char
print(result)
The loop walks through each character, checks if it is a digit, and appends it to the result string. For small scripts this is perfectly readable.
You can make this more compact by passing a generator expression to "".join(), which is usually faster for larger strings because it avoids repeated string concatenation:
inp_str = "SKU-4829-A"
result = "".join(char for char in inp_str if char.isdigit())
print(result)
Using "".join() with a generator expression is the Pythonic way to build strings character by character. It avoids the overhead of creating intermediate string objects that concatenation would produce.
If you need the digits as integers rather than a concatenated string, you can convert them at the end:
inp_str = "Order-1034-spent-500"
digits = [int(char) for char in inp_str if char.isdigit()]
print(digits)
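That gives you a list of individual integer digits, here [1, 0, 3, 4, 5, 0, 0]. If you want them as a single number, join and convert. Note that this merges every digit in the string, so “Order-1034-spent-500” becomes 1034500, not 1034 and 500: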
inp_str = "Order-1034-spent-500"
number = int("".join(char for char in inp_str if char.isdigit()))
print(number)
print(type(number))
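The join-and-convert step fails when the string contains no digits, because int("") raises ValueError. A minimal sketch of a guard follows; the helper name digits_to_int is hypothetical, and returning None for "nothing usable" is just one possible convention:

```python
def digits_to_int(text):
    """Join every digit in text into one int, or return None if that fails."""
    digits = "".join(ch for ch in text if ch.isdigit())
    try:
        return int(digits)
    except ValueError:
        # Raised for an empty string, and for digit-like characters
        # (such as superscripts) that int() does not accept.
        return None

print(digits_to_int("Order-1034-spent-500"))  # 1034500
print(digits_to_int("no digits here"))        # None
```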
One thing to watch: isdigit() returns True for Unicode digit characters such as the fullwidth digit “２” (U+FF12) and the superscript “²” (U+00B2), in addition to plain ASCII digits. If you are processing user input or data from external sources, those characters will slip through, and int() rejects some of them (the superscripts, for example). For ASCII-only digits, you can check '0' <= char <= '9' instead.
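To make the ASCII-only check concrete, here is a small sketch that compares it with isdigit() on a string containing fullwidth digits. The helper name ascii_digits and the sample string are made up for illustration:

```python
def ascii_digits(text):
    """Keep only the ASCII characters '0' through '9'."""
    return "".join(ch for ch in text if "0" <= ch <= "9")

# "Room " followed by fullwidth 2 (U+FF12), ASCII 0, fullwidth 4 (U+FF14)
sample = "Room \uff120\uff14"

# isdigit() accepts all three digit characters
unicode_aware = "".join(ch for ch in sample if ch.isdigit())

print(len(unicode_aware))     # 3
print(ascii_digits(sample))   # 0
```

Which behaviour is right depends on the data: the range check is stricter, while isdigit() is more permissive about what counts as a digit.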
Method 2: Using Regex (re.findall)
Python’s regular expression module handles digit extraction with the pattern \d+, which matches one or more consecutive digit characters. The isdigit() string method covered earlier checks individual characters — regex takes a different approach by matching patterns across the full string. The function re.findall() returns all non-overlapping matches as a list of strings.
import re
inp_str = "Order-1034-spent-500"
result = re.findall(r'\d+', inp_str)
print(result)
Notice the difference from the isdigit() approach: regex groups consecutive digits together. From “Order-1034-spent-500”, re.findall() returns a list of two strings, ['1034', '500'], rather than individual characters. This is usually what you want when processing formatted strings like IDs, codes, and delimited data.
To get the numbers as actual integers instead of strings, convert each match with int():
import re
inp_str = "SKU-4829-A, QTY-12, DISC-99"
numbers = [int(n) for n in re.findall(r'\d+', inp_str)]
print(numbers)
print(sum(numbers))
The regex approach scales better to more complex patterns. If you only want four-digit groups, you can use \d{4} as the pattern; because \d{4} also matches the first four digits of a longer run, wrap it as (?<!\d)\d{4}(?!\d) when you need exactly four. If you want to match only numbers that appear after a specific prefix, you can use a lookbehind like (?<=SKU-)\d+.
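The following sketch tries these patterns on a made-up string. The "ORD-10345" entry is an assumption added to show how a bare \d{4} behaves on a longer run of digits:

```python
import re

inp_str = "SKU-4829-A, QTY-12, DISC-99, ORD-10345"

# A bare \d{4} also picks up the first four digits of "10345"
loose = re.findall(r'\d{4}', inp_str)
print(loose)    # ['4829', '1034']

# Lookarounds reject matches that have a digit on either side
exact = re.findall(r'(?<!\d)\d{4}(?!\d)', inp_str)
print(exact)    # ['4829']

# Lookbehind: only digits that directly follow the "SKU-" prefix
sku = re.findall(r'(?<=SKU-)\d+', inp_str)
print(sku)      # ['4829']
```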
Bonus: Using filter() with isdigit()
Python’s built-in filter() function offers a functional-style way to extract digits. The filter() function takes a predicate and an iterable, returning only the elements where the predicate returns True:
inp_str = "Phone: 987-654-3210"
digits = "".join(filter(str.isdigit, inp_str))
print(digits)
The key advantage of filter(str.isdigit, inp_str) is that str.isdigit is passed directly as a function, accessed on the str class rather than bound to a particular string, so no lambda is needed. It reads cleanly and avoids the explicit generator expression. This approach is equivalent to the generator expression version but reads closer to a declarative description of what you want.
FAQ
Which method is faster, isdigit() or regex?
It depends on the input. For short strings, isdigit() with join() avoids the regex engine’s setup overhead and is usually fast enough. On long strings, re.findall() can come out ahead because the scan runs inside the regex engine rather than in a Python-level loop. For pattern-matched extraction (like “extract only 4-digit groups”), regex is quicker to write and the performance difference is negligible for typical string lengths. Profile your specific use case if performance is critical.
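One way to profile this is the standard library’s timeit module. The sketch below is only a starting point: the sample string, the repetition count, and the function names with_isdigit and with_regex are arbitrary choices, and the timings will vary by machine and Python version:

```python
import re
import timeit

# Arbitrary test input: about 21,000 characters
sample = "Order-1034-spent-500 " * 1000

def with_isdigit(text):
    """Collect every digit with a character-by-character check."""
    return "".join(ch for ch in text if ch.isdigit())

def with_regex(text):
    """Collect every digit by joining the regex matches."""
    return "".join(re.findall(r'\d+', text))

# Both approaches should produce the same result on ASCII input
assert with_isdigit(sample) == with_regex(sample)

for fn in (with_isdigit, with_regex):
    seconds = timeit.timeit(lambda: fn(sample), number=100)
    print(f"{fn.__name__}: {seconds:.4f}s for 100 runs")
```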
Does isdigit() handle negative numbers?
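No. The minus sign “-” is not a digit, so calling isdigit() on “-42” returns False for the whole string. With regex, the pattern -?\d+ keeps a leading minus sign; a leading plus sign is dropped unless you use [+-]?\d+ instead. Note that a hyphen used as a separator, as in “SKU-4829”, also matches as a minus sign: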
import re
text = "Temp: -5C, Pressure: +1013hPa, reading 42"
signed = re.findall(r'-?\d+', text)
print(signed)
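If the data mixes signed numbers with hyphen-delimited codes, one possible refinement is to accept the minus sign only when it does not directly follow a letter, digit, or underscore. This is a sketch of that idea, not a general number parser; the SIGNED name and the sample strings are made up:

```python
import re

# Optional "-" that is only accepted when no word character precedes it
SIGNED = re.compile(r'(?:(?<!\w)-)?\d+')

# The plain pattern reads the separator hyphen as a sign
print(re.findall(r'-?\d+', "SKU-4829-A"))         # ['-4829']

# The refined pattern skips the separator but keeps real signs
print(SIGNED.findall("SKU-4829-A"))               # ['4829']
print(SIGNED.findall("Temp: -5C, reading 42"))    # ['-5', '42']
```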
How do I extract digits from a string with decimals like “3.14”?
Use the regex pattern \d+\.?\d* to capture the integer part, optional decimal point, and fractional part as a single unit:
import re
text = "Pi is approximately 3.14159 and e is 2.71828"
decimals = re.findall(r'\d+\.?\d*', text)
print(decimals)
floats = [float(d) for d in decimals]
print(floats)
['3.14159', '2.71828']
[3.14159, 2.71828]
What about Unicode digits like “２” (fullwidth two)?
isdigit() returns True for Unicode digits including fullwidth forms (U+FF10 through U+FF19) and superscripts like “\u00B2” (squared). It returns False for Roman numeral characters; those are only accepted by isnumeric(). If you only want ASCII digits 0-9, use '0' <= c <= '9' instead of isdigit(). Regex \d in Python 3 matches Unicode decimal digits by default on str patterns, which includes fullwidth forms but not superscripts. Pass the re.ASCII flag or use [0-9] for ASCII-only matching.
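A short sketch of these differences, using a made-up string that mixes a superscript two, fullwidth digits, and ASCII digits:

```python
import re

# superscript two (U+00B2), fullwidth "12" (U+FF11, U+FF12), ASCII "34"
text = "x\u00b2 + \uff11\uff12 + 34"

# isdigit() accepts the superscript, both fullwidth digits, and "3", "4"
isdigit_chars = [ch for ch in text if ch.isdigit()]
print(len(isdigit_chars))    # 5

# Default \d matches the fullwidth run but skips the superscript
unicode_runs = re.findall(r'\d+', text)
print(len(unicode_runs))     # 2

# re.ASCII (or an explicit [0-9] class) restricts matching to 0-9
ascii_runs = re.findall(r'\d+', text, flags=re.ASCII)
print(ascii_runs)            # ['34']
print(re.findall(r'[0-9]+', text))    # ['34']
```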
Can I extract digits only at the start or end of a string?
Use regex anchors. re.findall(r'^\d+', s) extracts leading digits only; re.findall(r'\d+$', s) extracts trailing digits only:
import re
text = "2024-Annual-Report-Final"
leading = re.findall(r'^\d+', text)
trailing = re.findall(r'\d+$', text)
print(f"Leading: {leading}")
print(f"Trailing: {trailing}")
Leading: ['2024']
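Trailing: []
The ^ anchor locks the match to the start of the string; $ anchors it to the end. This string ends in “Final” rather than a digit, so the trailing pattern finds nothing and re.findall() returns an empty list. Neither anchor has an isdigit() equivalent without additional logic.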
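For comparison, the additional logic on the isdigit() side could look like the sketch below, which uses itertools.takewhile to stop at the first non-digit. The helper names leading_digits and trailing_digits are hypothetical:

```python
from itertools import takewhile

def leading_digits(text):
    """Return the run of digits at the start of text ('' if none)."""
    return "".join(takewhile(str.isdigit, text))

def trailing_digits(text):
    """Return the run of digits at the end of text ('' if none)."""
    # Walk backwards from the end, then restore the original order
    return "".join(takewhile(str.isdigit, reversed(text)))[::-1]

print(leading_digits("2024-Annual-Report-Final"))   # 2024
print(trailing_digits("Annual-Report-2024"))        # 2024
```

Unlike re.findall(), these helpers return an empty string rather than an empty list when there is no match.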

