Save each row in Spark Dataframe into different file




I construct a Spark DataFrame with the following structure:


root
|-- tickers: string (nullable = true)
|-- name: string (nullable = true)
|-- price: array (nullable = true)
| |-- element: map (containsNull = true)
| | |-- key: string
| | |-- value: map (valueContainsNull = true)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = true)



I want to save each row's price object to a separate JSON/CSV file, using the corresponding name string as the filename. Is there a way to do this in a Python environment?
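One possible approach, sketched under assumptions rather than offered as a definitive answer: stream the rows to the driver and write one JSON file per row with plain Python. The helper names `safe_filename`, `save_price` and `save_all` are hypothetical, and the sketch assumes the DataFrame is small enough to iterate on the driver and that the output directory is on the driver's local filesystem. JSON is used because price is a nested array of maps, which does not map cleanly onto a flat CSV.

```python
import json
import re
from pathlib import Path


def safe_filename(name):
    # Replace anything that is not a letter, digit, dot, dash or underscore,
    # so that values such as "Foo/Bar" cannot escape the output directory.
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def save_price(row, out_dir):
    """Write row["price"] to <out_dir>/<name>.json and return the path."""
    out_path = Path(out_dir) / (safe_filename(row["name"]) + ".json")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(row["price"], fh)
    return out_path


def save_all(rows, out_dir):
    # `rows` is any iterable of dict-like rows, for example
    # (r.asDict(recursive=True) for r in df.toLocalIterator()).
    return [save_price(row, out_dir) for row in rows]
```

With a DataFrame `df` of the structure above, this could be called as `save_all((r.asDict(recursive=True) for r in df.toLocalIterator()), "prices")`. Note that two different names that sanitize to the same string would overwrite each other, so the sketch assumes the sanitized names stay unique.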





The most relevant solution I found is to repartition the DataFrame into as many partitions as it has rows, and then use .write.csv() (see https://stackoverflow.com/a/49890590/6158414). But this doesn't fit my need to save each row into a separate file with its own filename.
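A variation on that idea, as a hedged sketch: writing with `df.select("name", "price").write.partitionBy("name").json(src_dir)` produces one `name=<value>/` directory per distinct name, each holding Spark-named part files. A small post-processing step can then copy those into files named after the value. The function `collect_partition_files` below is hypothetical, and it assumes the output lives on a local or mounted filesystem (not HDFS or S3) and that the name values contain no characters that Spark escapes in directory names. JSON is used because Spark's CSV writer does not support array or map columns.

```python
import shutil
from pathlib import Path


def collect_partition_files(src_dir, dst_dir, column="name", ext=".json"):
    """Merge the part files under each <column>=<value>/ directory
    into a single file <dst_dir>/<value><ext>."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    prefix = column + "="
    written = []
    for part_dir in sorted(Path(src_dir).iterdir()):
        if not (part_dir.is_dir() and part_dir.name.startswith(prefix)):
            continue  # skip _SUCCESS and other non-partition entries
        target = dst / (part_dir.name[len(prefix):] + ext)
        with open(target, "wb") as out:
            # "part-*" does not match the hidden ".part-*.crc" checksum files.
            for part in sorted(part_dir.glob("part-*")):
                with open(part, "rb") as src:
                    shutil.copyfileobj(src, out)
        written.append(target)
    return written
```

Each resulting file holds JSON Lines, one line per original row, and the partition column itself is not repeated inside the file because Spark stores it in the directory name.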





To give more context: I'm using Spark to call an API and retrieve data in parallel. Each "row" in the Spark DataFrame is the result of a query for a unique value of tickers. The last step of my process is to save each query result separately. I would also appreciate it if someone has a better way to do this.
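Since each row already corresponds to one API call, one alternative worth considering is to write the file inside the same task that performs the call, skipping the DataFrame for the save step. The following is only a sketch: `fetch_and_save_partition` and the `fetch` callable are hypothetical stand-ins for the actual API call, the file is named after the ticker rather than the name, and `out_dir` is assumed to be a path every executor can write to (for example a shared mount).

```python
import json
from pathlib import Path


def fetch_and_save_partition(tickers, fetch, out_dir):
    """For each ticker in the iterator, call fetch(ticker) and write the
    result to <out_dir>/<ticker>.json. Returns the number of files written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for ticker in tickers:
        result = fetch(ticker)  # hypothetical API call supplied by the caller
        with open(out / f"{ticker}.json", "w", encoding="utf-8") as fh:
            json.dump(result, fh)
        count += 1
    return count
```

On a cluster this could be driven with something like `spark.sparkContext.parallelize(tickers).foreachPartition(lambda it: fetch_and_save_partition(it, fetch_prices, "/mnt/shared/prices"))`, where `fetch_prices` and the path are placeholders. The return value is ignored by `foreachPartition` and is only there to make the helper easy to check locally.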





Thanks a lot!








