pyspark writing jdbc times out

I am using PySpark (JDBC format) to read tables from a database and then write that data to an Azure Data Lake. The code that I’ve written works, except on the very large tables (400k rows, 50 columns), which fail with the following error: Py4JJavaError: An error occurred while calling o94.parquet. : org.apache.spark.SparkException: Job […]
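
For context, a read-then-write job like the one described might look roughly like the sketch below. It is not the original code, and it is not a confirmed fix for the error above, since the full stack trace is cut off. The helper names (`build_jdbc_options`, `copy_table`), the default values, and every connection detail are hypothetical. The option keys themselves (`url`, `dbtable`, `user`, `password`, `fetchsize`, `partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`) are standard Spark JDBC data source options. Splitting the read across partitions is one common thing to try when a single JDBC connection is slow on a large table.

```python
def build_jdbc_options(url, table, user, password,
                       partition_column=None, lower_bound=None,
                       upper_bound=None, num_partitions=8,
                       fetch_size=10000):
    """Build the option dict for spark.read.format("jdbc").

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    options = {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        # Rows fetched per round trip; drivers often default to a small value.
        "fetchsize": str(fetch_size),
    }
    if partition_column is not None:
        # Spark requires the partitioning options to be given together.
        if lower_bound is None or upper_bound is None:
            raise ValueError("lower_bound and upper_bound are required "
                             "when partition_column is set")
        options.update({
            "partitionColumn": partition_column,
            "lowerBound": str(lower_bound),
            "upperBound": str(upper_bound),
            "numPartitions": str(num_partitions),
        })
    return options


def copy_table(spark, options, target_path):
    """Read one table over JDBC and write it out as Parquet.

    `spark` is an existing SparkSession; `target_path` is assumed to be a
    location the session is already configured to write to.
    """
    df = spark.read.format("jdbc").options(**options).load()
    df.write.mode("overwrite").parquet(target_path)
```

With a numeric key column, a call could be `copy_table(spark, build_jdbc_options(url, "big_table", user, password, partition_column="id", lower_bound=1, upper_bound=400000), target_path)`, where `big_table`, `id`, and the bounds are placeholders to adapt to the real table.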