Big Data Streaming Platform
Auto Schema detection/generation
Getting rid of the schema registry altogether could be the best solution. Rather than maintaining a registry, schemas could be detected from the data itself; as the schema evolves, those changes could be fed to a machine learning library to predict the schema of the incoming stream.
Unified Serialization approach
Adopting Avro as the standard serialization protocol can eliminate many interoperability issues between the various open-source systems in the platform. All data, regardless of the originating source, should be converted into Avro before landing in Kafka.
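One way to enforce that convention is a common Avro envelope that every source record is wrapped in before the Kafka produce call. The schema and helper below are a hypothetical sketch of that idea; in practice the envelope would be serialized with an Avro library (e.g. fastavro) and sent with a real Kafka producer client.

```python
import time

# Hypothetical envelope schema: every record, whatever its source,
# lands in Kafka wrapped in this Avro record.
ENVELOPE_SCHEMA = {
    "type": "record",
    "name": "Envelope",
    "fields": [
        {"name": "source", "type": "string"},     # originating system
        {"name": "ingest_ts", "type": "long"},    # ingestion time, epoch millis
        {"name": "payload", "type": "bytes"},     # source record, Avro-encoded
    ],
}

def to_envelope(source, payload_bytes):
    """Wrap an already-serialized source record in the common envelope."""
    return {
        "source": source,
        "ingest_ts": int(time.time() * 1000),
        "payload": payload_bytes,
    }
```

Because every topic then carries the same outer schema, downstream consumers only need source-specific logic for the payload, not for the framing.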
Kafka and Spark for ETLs
It is time to say goodbye to legacy ETL tools and embrace Spark for all heavy computation, with Kafka moving the data across the platform.