Save each row in Spark Dataframe into different file

I constructed a Spark DataFrame with the following structure:
root
 |-- tickers: string (nullable = true)
 |-- name: string (nullable = true)
 |-- price: array (nullable = true)
 |    |-- element: map (containsNull = true)
 |    |    |-- key: string
 |    |    |-- value: map (valueContainsNull = true)
 |    |    |    |-- key: string
 |    |    |    |-- value: string (valueContainsNull = true)
I want to save each object in the price column into a separate JSON/CSV file, with each file named after the corresponding string in the name column. Is there a way to implement this in a Python environment?
The most relevant solution I have found is to repartition the DataFrame so that the number of partitions equals the number of rows, and then use .write.csv() (see https://stackoverflow.com/a/49890590/6158414). But this doesn't fit my need to save each row into a separate file with its own filename.
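For illustration, one possible direction is to skip the DataFrame writer and write each row by hand with foreach. This is only a minimal sketch: the helper names (safe_filename, save_row, save_all) and the out_dir parameter are hypothetical, and it assumes that name is non-null and unique per row and that out_dir is a path every executor can write to (local mode or a shared mount).

```python
import json
import os
import re


def safe_filename(name):
    # Replace characters that are awkward in file names (illustrative rule only).
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def save_row(name, price, out_dir):
    # Write one row's price value to <out_dir>/<name>.json and return the path.
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, safe_filename(name) + ".json")
    with open(path, "w") as f:
        json.dump(price, f)
    return path


def save_all(df, out_dir):
    # foreach runs on the executors, so the files are created wherever each
    # executor runs. Assumption: out_dir is writable from all of them.
    df.rdd.foreach(lambda row: save_row(row["name"], row["price"], out_dir))
```

With a DataFrame like the one above, save_all(df, "/some/shared/dir") would produce one JSON file per row, each holding that row's price array.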
To give more context: I'm using Spark to call an API and retrieve data in parallel. Each row in the Spark DataFrame is the result of a query for a unique value of tickers. The last step of my process is to save each query result separately. I would also appreciate it if someone has a better way to do this.
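If the combined results are small enough to stream through the driver, another option is to pull the rows back one at a time and write the files there. The sketch below assumes exactly that, plus non-null names that are valid file names; save_from_driver and out_dir are hypothetical names used only for illustration.

```python
import json
import os


def save_from_driver(df, out_dir):
    # Stream rows to the driver and write one JSON file per row there.
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for row in df.select("name", "price").toLocalIterator():
        path = os.path.join(out_dir, row["name"] + ".json")
        with open(path, "w") as f:
            json.dump(row["price"], f)
        written.append(path)
    return written
```

This keeps all files on the driver's local disk, at the cost of moving every query result through a single machine.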
Thanks a lot!