What is This?

This script takes an input CSV file and outputs a copy of the CSV file with particular columns removed. Provide cols_to_remove with a list containing the indexes of columns in the CSV file that you want to be removed (starting from index 0 - so the first column would be 0).

I have used indexes because they are easier to use programmatically and if I did use header titles, one character or space would have put it off. This method scales very well and can remove columns where MS Excel can't open a large file.

Code

import csv

input_file = 'input.csv'
output_file = 'output.csv'
cols_to_remove = [1, 4, 10, 11] # Column indexes to be removed (starts at 0)

cols_to_remove = sorted(cols_to_remove, reverse=True) # Reverse so we remove from the end first
row_count = 0 # Current amount of rows processed

with open(input_file, "r") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='') as result:
        writer = csv.writer(result)
        for row in reader:
            row_count += 1
            print('\r{0}'.format(row_count), end='') # Print rows processed
            for col_index in cols_to_remove:
                del row[col_index]
            writer.writerow(row)
  • CSV Splitter: Splits a CSV file into multiple files with a row limit per file
  • CSV to Sqlite: A command-line tool that copies data from CSV files into a SQLite database.