I recently finished writing two assets: a Spark-based data ingestion framework and a Spark-based data quality framework, both metadata-driven. As is typical for such frameworks, the metadata that drives their behavior is stored in an RDBMS. In the data ingestion framework, I needed to store parameters for the source (username, password, path, format, and so on), a similar set of parameters for the destination, compression settings, and more. In conventional schemas, I have seen each of these parameters modeled as its own column in a table.
Being a programmer at heart, I decided not to use multiple columns. Instead, all the parameters would be stored in a single column, as a string in the database table. The Spark application would be responsible for parsing the string and extracting the parameters it needs.
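To make this concrete, here is a minimal sketch of the parsing side. It assumes the column holds semicolon-delimited key=value pairs and that the framework is written in Scala; the parseParams helper and the example values are hypothetical, not the framework's actual code.

```scala
// Minimal sketch: turn the delimited parameter string from the metadata
// table into a Map the Spark job can look settings up in. The delimiter,
// field names, and parseParams helper are illustrative assumptions.
object ParamParser {
  def parseParams(raw: String): Map[String, String] =
    raw.split(";")
      .iterator
      .map(_.trim)
      .filter(p => p.nonEmpty && p.contains("="))
      .map { pair =>
        val Array(key, value) = pair.split("=", 2) // split on the first '=' only
        key.trim -> value.trim
      }
      .toMap

  def main(args: Array[String]): Unit = {
    // Example of what the single metadata column might hold for one source.
    val stored = "username=svc_ingest;path=/data/raw/orders;format=parquet;compression=snappy"
    val params = parseParams(stored)
    println(params("path"))       // /data/raw/orders
    println(params.get("header")) // None: missing keys surface as an Option
  }
}
```

A JSON string parsed with a JSON library would serve equally well as the encoding; whatever the format, the parsing logic lives in the application rather than in the table definition.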