lp.read_csv
Reads a CSV file and returns a LazyFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path_or_buffer` | `str \| StringIO \| TextIOBase` | Path to the CSV file or a buffer-like object. | required |
| `header` | `bool \| int \| None` | Indicates whether the CSV file has a header row. Can be a boolean or the row number of the header. | `None` |
| `compression` | `str \| None` | Compression type of the file. Options are 'none', 'gzip', or 'zstd'. | `None` |
| `sep` | `str \| None` | Character that separates columns. Alias for `delimiter`. | `None` |
| `delimiter` | `str \| None` | Character that separates columns. | `None` |
| `dtype` | `dict[str, str] \| list[str] \| None` | Column data types. Can be a dictionary mapping column names to types, or a list of types. | `None` |
| `na_values` | `str \| list[str] \| None` | Values to interpret as NA/NaN. | `None` |
| `skip_rows` | `int \| None` | Number of lines to skip at the start of the file. | `None` |
| `quote_char` | `str \| None` | Character used for quoting. | `None` |
| `escape_char` | `str \| None` | Character used for escaping. | `None` |
| `encoding` | `str \| None` | File encoding. | `None` |
| `parallel` | `bool \| None` | Enables or disables parallel reading. | `None` |
| `date_format` | `str \| None` | Format to use when parsing dates. | `None` |
| `timestamp_format` | `str \| None` | Format to use when parsing timestamps. | `None` |
| `sample_size` | `int \| None` | Number of rows to sample for type inference. | `None` |
| `all_varchar` | `bool \| None` | If True, assumes all columns are of type VARCHAR, skipping type inference. | `None` |
| `normalize_names` | `bool \| None` | Normalizes column names to lowercase and replaces spaces with underscores. | `None` |
| `null_padding` | `bool \| None` | If True, adds null padding to text columns. | `None` |
| `names` | `list[str] \| None` | List of column names to use. | `None` |
| `line_terminator` | `str \| None` | Character that indicates the end of a line. | `None` |
| `columns` | `dict[str, str] \| None` | Dictionary specifying column names and types in the CSV file. | `None` |
| `auto_type_candidates` | `list[str] \| None` | List of types for the parser to consider during type inference. | `None` |
| `max_line_size` | `int \| None` | Maximum size of a line in the CSV file. | `None` |
| `ignore_errors` | `bool \| None` | If True, ignores errors during CSV reading. | `None` |
| `store_rejects` | `bool \| None` | If True, stores rejected lines during reading. | `None` |
| `rejects_table` | `str \| None` | Name of the table to store rejected lines. | `None` |
| `rejects_scan` | `str \| None` | Path to store the scan of rejected lines. | `None` |
| `rejects_limit` | `int \| None` | Maximum number of rejected lines before the read stops. | `None` |
| `force_not_null` | `list[str] \| None` | List of columns whose values should not be interpreted as NULL. | `None` |
| `buffer_size` | `int \| None` | Size of the read buffer. | `None` |
| `decimal` | `str \| None` | Decimal separator for numbers. | `None` |
| `allow_quoted_nulls` | `bool \| None` | If True, allows conversion of quoted values to NULL. | `None` |
| `include_filename` | `bool \| str \| None` | If True or a string, includes the filename in the output. | `None` |
| `hive_partitioning` | `bool \| None` | Enables Hive partitioning. | `None` |
| `union_by_name` | `bool \| None` | If True, unions files by column name. | `None` |
| `hive_types` | `dict[str, str] \| None` | Dictionary specifying Hive types for columns. | `None` |
| `hive_types_autocast` | `bool \| None` | If True, automatically casts Hive types. | `None` |
| `parse_dates` | `list[str] \| None` | List of column names to parse as dates. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `LazyFrame` | `LazyFrame` | A LazyFrame containing the data from the CSV file. |
Example:
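A minimal sketch of a call, using only parameters listed in the table above. The import that binds `lp` is not shown in this section, so the helper below takes the module as an argument rather than guessing its name. The helper `load_people`, the sample data, and the `"VARCHAR"` type string passed to `dtype` are illustrative assumptions (the type name is borrowed from the `all_varchar` description), not confirmed usage.

```python
import io

# In-memory CSV; `path_or_buffer` is documented as accepting a StringIO buffer.
CSV_TEXT = "id;name;score;joined\n1;Ada;9.5;2021-03-01\n2;Grace;NA;2022-07-15\n"

# Keyword arguments taken from the parameter table.
# The "VARCHAR" type string is an assumption, not a documented value.
READ_OPTIONS = {
    "header": True,             # first row holds the column names
    "sep": ";",                 # alias for `delimiter`
    "na_values": ["NA"],        # treat the literal NA as a missing value
    "dtype": {"name": "VARCHAR"},
    "parse_dates": ["joined"],  # parse this column as dates
}


def load_people(lp):
    """Hypothetical helper: `lp` is the module that provides `read_csv`."""
    buffer = io.StringIO(CSV_TEXT)
    # Per the Returns table, this is a LazyFrame.
    return lp.read_csv(buffer, **READ_OPTIONS)
```

Every option except `path_or_buffer` defaults to `None`, so only the arguments that differ from the defaults need to be passed.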