I ran into a situation last year where I had a pandas DataFrame with 50 columns and needed to pull out specific columns by their numerical position. The column names were generated programmatically and I had no way to know them ahead of time. That’s when I reached for iloc.

This article covers how iloc() works, how to select rows and columns by integer position, and the common mistakes that trip people up. By the end, you’ll be able to slice any DataFrame precisely without knowing its column names. The pandas library provides iloc as part of its standard API for this kind of positional selection.

TLDR

  • iloc() selects data by integer position, not by label – row 0 is always the first row
  • Syntax: df.iloc[row_selector, col_selector] – both accept integers, lists, slices, boolean arrays
  • Slice notation excludes the end boundary (df.iloc[0:2] returns rows 0 and 1, not row 2)
  • Out-of-range integer positions raise IndexError, while out-of-range slices are silently truncated – check df.shape before indexing if uncertain
  • For label-based selection, use loc() instead of iloc()

What is iloc() in Python?

iloc is a pandas DataFrame indexer that selects rows and columns by their integer position. Although it is often written as iloc(), it is an attribute used with square brackets (df.iloc[...]) rather than a method you call. The name stands for “integer location” – it works purely with positional indices, not with the row or column labels you see in the DataFrame header. This makes it especially useful when column names are unknown at runtime or generated dynamically, or when data comes from sources like CSV files where column order is more predictable than column names.
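
As a rough illustration of that scenario, the sketch below builds a frame whose column names are generated at runtime and then picks columns purely by position. The names `generated_names`, `wide`, and `picked` are made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 2 rows, 50 columns with programmatically generated names.
generated_names = [f"col_{i}" for i in range(50)]
wide = pd.DataFrame(np.arange(100).reshape(2, 50), columns=generated_names)

# Select the first, eleventh, and last columns without referring to their names.
picked = wide.iloc[:, [0, 10, -1]]
print(picked.columns.tolist())
```

The selection never mentions a column name, so it keeps working even if the naming scheme changes.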

Under the hood, pandas assigns a zero-based integer index to every row and column regardless of what labels they carry. iloc() reads those positions directly, which is why it behaves differently from loc(), the label-based counterpart. You can read more about pandas DataFrames in the Python DataFrame guide.

Syntax and Parameters

For a DataFrame, the square brackets after iloc take up to two selectors separated by a comma. The first selects rows and the second selects columns.


DataFrame.iloc[row_selector, col_selector]


# Full slices on both axes select every row and every column
df.iloc[:, :]  # same contents as df

Both row_selector and col_selector accept these types of values:

  • Single integer – df.iloc[0] selects the first row
  • List of integers – df.iloc[[0, 2, 5]] selects rows 0, 2, and 5
  • Slice object – df.iloc[1:4] selects rows 1, 2, and 3 (end is excluded)
  • Boolean array – df.iloc[mask] selects rows where the mask is True (pass a list or NumPy array, not a boolean Series)

If you omit the column selector, all columns are returned for the selected rows. The selector types are the same for both rows and columns.
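
To make the four selector types concrete, here is a small sketch that applies each of them to both axes at once. The frame `demo` and the variable names are invented for illustration.

```python
import pandas as pd

demo = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12]})

single = demo.iloc[0, 1]            # single integers: one cell (row 0, column 1)
listed = demo.iloc[[0, 2], [0, 2]]  # lists: rows 0 and 2, columns 0 and 2
sliced = demo.iloc[1:3, 0:2]        # slices: rows 1-2, columns 0-1
# Boolean lists: one flag per row and one flag per column.
masked = demo.iloc[[True, False, False, True], [False, True, True]]

print(single)
print(listed)
print(sliced)
print(masked)
```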

Selecting Rows

Row selection in iloc() can be done with a single integer, a list of integers, or a slice. Each approach returns a different subset of rows from the DataFrame.


import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['Engineering', 'Sales', 'Marketing'],
    'salary': [85000, 72000, 69000]
})
print(df)


      name   department  salary
0    Alice  Engineering   85000
1      Bob        Sales   72000
2  Charlie    Marketing   69000

Single row: Passing a single integer returns a pandas Series with column names as the index.
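
The call itself does not appear alongside this output; it is presumably df.iloc[0]. A self-contained sketch that rebuilds the example frame (the variable `first_row` is just a name chosen here):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['Engineering', 'Sales', 'Marketing'],
    'salary': [85000, 72000, 69000]
})

# A single integer selects one row and returns it as a Series.
first_row = df.iloc[0]
print(first_row)
```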


name                Alice
department    Engineering
salary              85000
Name: 0, dtype: object

Multiple specific rows: A list of integers selects those positions in the order given.
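
The matching call is presumably df.iloc[[0, 2]]. A self-contained sketch, with `picked_rows` as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['Engineering', 'Sales', 'Marketing'],
    'salary': [85000, 72000, 69000]
})

# A list of positions returns a DataFrame with those rows, in the order listed.
picked_rows = df.iloc[[0, 2]]
print(picked_rows)
```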


      name   department  salary
0    Alice  Engineering   85000
2  Charlie    Marketing   69000

Slice of rows: A slice object selects a continuous range. The start is included and the end is excluded.
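
The matching call is presumably df.iloc[1:3]. A self-contained sketch, with `row_slice` as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['Engineering', 'Sales', 'Marketing'],
    'salary': [85000, 72000, 69000]
})

# Positions 1 and 2 are returned; the end position 3 is excluded.
row_slice = df.iloc[1:3]
print(row_slice)
```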


      name  department  salary
1      Bob       Sales   72000
2  Charlie   Marketing   69000

The slice 1:3 returns rows at positions 1 and 2, excluding position 3. This is a common point of confusion coming from languages where slices are inclusive at both ends.
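
The same confusion can arise inside pandas itself, because label-based slices with loc include the end label. A small comparison sketch, using a throwaway frame named `numbers`:

```python
import pandas as pd

numbers = pd.DataFrame({"value": [10, 20, 30, 40]})  # default labels 0..3

by_position = numbers.iloc[1:3]  # positions 1 and 2 (end excluded)
by_label = numbers.loc[1:3]      # labels 1, 2, and 3 (end included)

print(by_position["value"].tolist())
print(by_label["value"].tolist())
```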

Selecting Columns

Add a second argument to specify which columns to select. The column selector uses the same integer-based logic as row selection – it refers to position, not name. You can read more about Python data types to understand how pandas stores different kinds of data in columns.


# Select second column only (position 1)
print(df.iloc[:, 1])


0    Engineering
1          Sales
2      Marketing
Name: department, dtype: object

The colon on the left side selects all rows, and 1 selects the column at position 1, which is the department column. You can also pass a list to select multiple columns.


# Select first and third columns by position
print(df.iloc[:, [0, 2]])


      name  salary
0    Alice   85000
1      Bob   72000
2  Charlie   69000

Passing [0, 2] as the column selector returns only the name and salary columns, skipping the middle column. The positions refer to the original DataFrame column order, not alphabetical or any other ordering.

Selecting Rows and Columns Together

The real strength of iloc() is combining row and column selectors simultaneously to extract rectangular slices from a DataFrame.


# Select rows 0 and 2, columns 0 and 2 simultaneously
print(df.iloc[[0, 2], [0, 2]])


      name  salary
0    Alice   85000
2  Charlie   69000

Both selectors are applied at the same time, returning the intersection of the specified row and column positions. Slices also work on both axes.


# Slice rows 1-2 and columns 1-2
print(df.iloc[1:3, 1:3])


  department  salary
1      Sales   72000
2  Marketing   69000

Using slices on both axes selects a rectangular region. Both slices stop before position 3, so rows 1 and 2 are returned with only the middle and last columns (department and salary).
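
One detail worth seeing in code: a slice that runs past the end of the frame is truncated, while a single out-of-range integer raises IndexError. A minimal sketch, where the names `region` and `raised` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['Engineering', 'Sales', 'Marketing'],
    'salary': [85000, 72000, 69000]
})

# Slices that extend past the last position are cut off at the edge.
region = df.iloc[1:10, 1:10]
print(region.shape)

# A single integer past the last position is an error.
raised = False
try:
    df.iloc[3]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

Checking df.shape first avoids the exception when the position comes from user input or another computation.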

Boolean Indexing

Pass a boolean array to the row selector to filter rows based on a condition. The array must have one value per row in the DataFrame. This technique is commonly used when filtering data loaded from external sources like CSV files, which you can read more about in the Python CSV module guide.


mask = [True, False, True]
print(df.iloc[mask])


      name   department  salary
0    Alice  Engineering   85000
2  Charlie    Marketing   69000

A more practical use is deriving the mask from the DataFrame itself. For example, filtering rows where the salary column exceeds 70000:


salary_threshold = df.iloc[:, 2] > 70000
print(df.iloc[salary_threshold.values])


    name   department  salary
0  Alice  Engineering   85000
1    Bob        Sales   72000

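The mask checks the third column (position 2, the salary column) for values above 70000, which returns Alice and Bob. The comparison produces a boolean Series, and .values converts it to a plain NumPy array because iloc does not accept a boolean Series as a mask. This pattern works well when the filtering condition is computed from the data itself rather than hardcoded.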
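
The sketch below shows why the conversion to an array matters. In the pandas versions I am aware of, passing the boolean Series directly raises an error (NotImplementedError or ValueError, depending on the index), so treat the exact exception type as an assumption. `to_numpy()` is used here as an equivalent of `.values`.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['Engineering', 'Sales', 'Marketing'],
    'salary': [85000, 72000, 69000]
})

condition = df.iloc[:, 2] > 70000  # boolean Series that carries an index

# iloc rejects a mask that has its own index.
rejected = False
try:
    df.iloc[condition]
except (ValueError, NotImplementedError) as exc:
    rejected = True
    print(type(exc).__name__, exc)

# Converting to a plain NumPy array makes the mask purely positional.
filtered = df.iloc[condition.to_numpy()]
print(filtered)
```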

FAQ

Q: How does iloc() differ from loc()?

iloc() selects by integer position – row 0 is always the first row regardless of the row label. loc() selects by row and column labels – it returns the row with the matching label. Passing an integer to loc() can produce unexpected results because the integer is treated as a label, not a position. For example, if a DataFrame has row labels [5, 10, 15], then df.iloc[0] returns the first row while df.loc[0] raises a KeyError since no row has label 0.
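
The [5, 10, 15] example can be sketched as follows. The frame `labelled` and its single `value` column are invented for the demonstration.

```python
import pandas as pd

labelled = pd.DataFrame({"value": ["a", "b", "c"]}, index=[5, 10, 15])

first_by_position = labelled.iloc[0]  # first row, whatever its label is
row_with_label_5 = labelled.loc[5]    # row whose label is 5 (the same row here)

# No row carries the label 0, so loc cannot find it.
missing_label = False
try:
    labelled.loc[0]
except KeyError:
    missing_label = True

print(first_by_position["value"], row_with_label_5["value"], missing_label)
```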

Q: Does iloc() support negative indexing?

Yes. Like Python lists, negative indices count from the end. df.iloc[-1] returns the last row, df.iloc[-2] returns the second-to-last, and df.iloc[-len(df)] returns the first row. Anything smaller than -len(df) raises IndexError.
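
A quick sketch of where the negative boundary sits, using a throwaway frame called `letters`:

```python
import pandas as pd

letters = pd.DataFrame({"letter": ["a", "b", "c"]})

last = letters.iloc[-1]               # last row
first = letters.iloc[-len(letters)]   # -3 on a 3-row frame is still the first row

# One step further back is out of bounds.
too_far = False
try:
    letters.iloc[-len(letters) - 1]
except IndexError:
    too_far = True

print(last["letter"], first["letter"], too_far)
```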

Q: Can iloc() modify data in place?

Yes. Assignment works the same way as regular pandas indexing – df.iloc[row_selector, col_selector] = value modifies the DataFrame in place. This works for single cells, entire rows, or rectangular regions depending on the selectors used.
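
A minimal sketch of the three assignment shapes mentioned above, on an all-integer frame named `scores` (an invented example, chosen so that no dtype conversion is involved):

```python
import pandas as pd

scores = pd.DataFrame({"q1": [1, 2, 3], "q2": [4, 5, 6], "q3": [7, 8, 9]})

scores.iloc[0, 0] = 100        # single cell
scores.iloc[1] = [20, 50, 80]  # entire row
scores.iloc[1:3, 1:3] = 0      # rectangular region: rows 1-2, columns 1-2

print(scores)
```

Later assignments overwrite earlier ones, so the region assignment zeroes out part of the row set in the previous step.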

Q: What happens when iloc() is called with no arguments?

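Nothing useful: df.iloc[] is not valid Python and fails with a SyntaxError, because the square brackets need at least one selector. To select everything, use df.iloc[:] or df.iloc[:, :]. Both are functionally equivalent to df[:], but they are rarely used in practice since df alone achieves the same thing more clearly.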

Q: Why does df.iloc[0:1] return only one row when 0:2 would return two?

iloc() uses Python’s standard slice convention where the start is included and the end is excluded. df.iloc[0:1] returns rows from position 0 up to but not including position 1, which is just row 0. To get two rows starting from position 0, use df.iloc[0:2]. This differs from some languages that include both endpoints in a range.

iloc() is the tool I reach for whenever I need to work with positional data in a DataFrame. It pairs well with situations where column names are unknown or generated programmatically, or when processing data where column order is more reliable than column names. The main thing to keep in mind is that it works with positions, not labels – once that distinction is clear, iloc() becomes one of the most predictable parts of the pandas API.
