lp.read_csv

Reads a CSV file and returns a LazyFrame.

Parameters:

Name	Type	Description	Default
`path_or_buffer`	`str \| StringIO \| TextIOBase`	Path to the CSV file or a buffer-like object.	required
`header`	`bool \| int \| None`	Indicates whether the CSV file has a header row. Can be a boolean or the row number of the header. Defaults to None.	`None`
`compression`	`str \| None`	Compression type of the file. Options are 'none', 'gzip', or 'zstd'. Defaults to None.	`None`
`sep`	`str \| None`	Character that separates columns. Alias for `delimiter`. Defaults to None.	`None`
`delimiter`	`str \| None`	Character that separates columns. Defaults to None.	`None`
`dtype`	`dict[str, str] \| list[str] \| None`	Specifies column data types. Can be a dictionary with column names and types, or a list of types. Defaults to None.	`None`
`na_values`	`str \| list[str] \| None`	Values to interpret as NA/NaN. Defaults to None.	`None`
`skip_rows`	`int \| None`	Number of lines to skip at the start of the file. Defaults to None.	`None`
`quote_char`	`str \| None`	Character used for quoting. Defaults to None.	`None`
`escape_char`	`str \| None`	Character used for escaping. Defaults to None.	`None`
`encoding`	`str \| None`	File encoding. Defaults to None.	`None`
`parallel`	`bool \| None`	Enables or disables parallel reading. Defaults to None.	`None`
`date_format`	`str \| None`	Format to use when parsing dates. Defaults to None.	`None`
`timestamp_format`	`str \| None`	Format to use when parsing timestamps. Defaults to None.	`None`
`sample_size`	`int \| None`	Number of rows to sample for type inference. Defaults to None.	`None`
`all_varchar`	`bool \| None`	If True, assumes all columns are of type VARCHAR, skipping type inference. Defaults to None.	`None`
`normalize_names`	`bool \| None`	Normalizes column names to lowercase and replaces spaces with underscores. Defaults to None.	`None`
`null_padding`	`bool \| None`	If True, pads rows that have fewer columns than expected with NULL values. Defaults to None.	`None`
`names`	`list[str] \| None`	List of column names to use. Defaults to None.	`None`
`line_terminator`	`str \| None`	Character that indicates the end of a line. Defaults to None.	`None`
`columns`	`dict[str, str] \| None`	Dictionary specifying column names and types in the CSV file. Defaults to None.	`None`
`auto_type_candidates`	`list[str] \| None`	List of types for the parser to consider during type inference. Defaults to None.	`None`
`max_line_size`	`int \| None`	Maximum size of a line in the CSV file. Defaults to None.	`None`
`ignore_errors`	`bool \| None`	If True, ignores errors during CSV reading. Defaults to None.	`None`
`store_rejects`	`bool \| None`	If True, stores rejected lines during reading. Defaults to None.	`None`
`rejects_table`	`str \| None`	Name of the table to store rejected lines. Defaults to None.	`None`
`rejects_scan`	`str \| None`	Path to store the scan of rejected lines. Defaults to None.	`None`
`rejects_limit`	`int \| None`	Limit of rejected lines before stopping the read. Defaults to None.	`None`
`force_not_null`	`list[str] \| None`	List of columns that should not be interpreted as NULL. Defaults to None.	`None`
`buffer_size`	`int \| None`	Size of the read buffer. Defaults to None.	`None`
`decimal`	`str \| None`	Decimal separator for numbers. Defaults to None.	`None`
`allow_quoted_nulls`	`bool \| None`	If True, allows conversion of quoted values to NULL. Defaults to None.	`None`
`include_filename`	`bool \| str \| None`	If True or a string, includes the filename in the output. Defaults to None.	`None`
`hive_partitioning`	`bool \| None`	Enables Hive partitioning. Defaults to None.	`None`
`union_by_name`	`bool \| None`	If True, unions files by column name. Defaults to None.	`None`
`hive_types`	`dict[str, str] \| None`	Dictionary specifying Hive types for columns. Defaults to None.	`None`
`hive_types_autocast`	`bool \| None`	If True, automatically casts Hive types. Defaults to None.	`None`
`parse_dates`	`list[str] \| None`	List of column names to parse as dates. Defaults to None.	`None`
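
As a rough illustration of how several of the parsing parameters above might be combined, the sketch below prepares a small semicolon-separated CSV in memory and passes it to `lp.read_csv`. The sample data, the column names, and the `READ_OPTIONS` and `read_scores` names are hypothetical; the type names follow the `'INTEGER'` / `'VARCHAR'` style used elsewhere on this page.

```python
import io

# Hypothetical sample: one preamble line to skip, a header row, two records.
RAW = "exported 2024-01-01\nid;name;score\n1;alice;9.5\n2;bob;NA\n"
buffer = io.StringIO(RAW)

# Keyword arguments drawn from the parameter table; the values are illustrative.
READ_OPTIONS = {
    "header": True,          # the first row after the skipped line holds column names
    "sep": ";",              # column separator
    "skip_rows": 1,          # skip the "exported ..." preamble line
    "na_values": ["NA"],     # treat the string "NA" as a missing value
    "dtype": {"id": "INTEGER", "name": "VARCHAR"},
}


def read_scores(source):
    """Return a LazyFrame for the sample, using the options above."""
    # Imported inside the function so the sample data can be inspected
    # even where lazy_pandas is not installed.
    import lazy_pandas as lp

    return lp.read_csv(source, **READ_OPTIONS)
```

Calling `read_scores(buffer)` would then return the LazyFrame; columns not listed in `dtype` (here `score`) are presumably left to type inference.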

Returns:

Name	Type	Description
`LazyFrame`	`LazyFrame`	A LazyFrame containing the data from the CSV file.

Example:

```python
import lazy_pandas as lp

# Read data.csv with a header row, comma-separated, and explicit column types
df = lp.read_csv('data.csv', header=True, sep=',', dtype={'column1': 'INTEGER', 'column2': 'VARCHAR'})
df.head()
```
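
A second sketch, this time for date handling with `parse_dates` and `date_format`. It assumes that `date_format` accepts strftime-style specifiers such as `%d/%m/%Y`; that assumption is not stated in the table above, so the format string is first checked against the sample with the standard library. The data, the `pickup_date` column, and the `read_trips` helper are made up for illustration.

```python
import io
from datetime import datetime

# Assumed strftime-style format: day/month/year.
DATE_FORMAT = "%d/%m/%Y"

# Hypothetical sample with dates written as day/month/year.
RAW = "trip_id,pickup_date\n1,31/01/2024\n2,01/02/2024\n"

# Check the format string against the first record using only the stdlib.
first_date = RAW.splitlines()[1].split(",")[1]
parsed = datetime.strptime(first_date, DATE_FORMAT)


def read_trips(source):
    """Return a LazyFrame with pickup_date parsed as a date column."""
    # Imported inside the function so the format check above runs
    # even where lazy_pandas is not installed.
    import lazy_pandas as lp

    return lp.read_csv(
        source,
        header=True,
        parse_dates=["pickup_date"],  # columns to parse as dates
        date_format=DATE_FORMAT,      # how those dates are written in the file
    )
```

With the library installed, `read_trips(io.StringIO(RAW))` would be the call to try.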