Second Half 2022 Tech Predictions for Data and AI


As we emerge from the halftime show that is the year 2022, it’s time to take stock of how far we’ve come this year in big data, advanced analytics, and AI, and assess where we’re likely to go next.

Based on where we’ve been so far in 2022, Datanami feels confident in making these five predictions for the rest of the year.

Data Observability Continues to Run

The first half of the year was huge for data observability, which gives customers better visibility and metrics on what’s happening with data streams. As data becomes more important for decision-making, the health and usefulness of that data become more important too.

We saw a number of data observability startups gaining hundreds of millions of dollars in venture funding, including Cribl (Series D worth $150 million); Monte Carlo (Series D worth $135 million); Coralogix (Series D worth $142 million); and others. Others making news include Bigeye, which rolled out metadata metrics; StreamSets, which was bought by Software AG for $580 million; and IBM, which bought observability startup Databand last month.

This momentum will continue in the second half of 2022, as more data observability startups come out of the woodwork and existing ones seek to solidify their place in this nascent market.

Is real-time data poised for a surge? (Blue Planet Studio/Shutterstock)

Real-Time Data Pops

Real-time data has been sitting on the back burner for years, serving some niche use cases but not seeing widespread use among general businesses. But thanks to the COVID pandemic and the associated shake-up in business plans over the past couple of years, the conditions are now ripe for real-time data to make the jump into mainstream tech circles.

“I think streaming is finally happening,” Databricks CEO Ali Ghodsi said at the recent Data + AI Summit, noting a 2.5X growth in streaming workloads on the company’s cloud-based data platform. “They’re having more and more AI use cases that just need to be real-time.”

In-memory databases and in-memory data grids are also poised to benefit from the real-time renaissance (if that’s what it is). RocksDB, a fast embedded key-value store that has augmented event-based systems like Kafka, now has a drop-in replacement called Speedb. SingleStore, which combines OLTP and OLAP capabilities in a single relational framework, hit a $1.3 billion valuation in a funding round last month.

There’s also StarRocks, which recently got funded for a fast new OLAP database based on Apache Doris; Imply, which closed a $100 million Series D in May to continue its Apache Druid-based real-time analytics business; and DataStax, which added Apache Pulsar to its Apache Cassandra toolkit and raised $115 million to drive real-time application development. Datanami expects this focus on real-time data analysis to continue.

Regulatory Growth

It’s been four years since GDPR went into effect, putting cavalier big data users on notice and hastening the rise of data governance as an essential ingredient in responsible data programs. In the US, the task of regulating data access has fallen to the states, and California is leading the way with CCPA, which mimics the GDPR in many ways. More states are likely to follow suit, complicating the data privacy equation for US companies.

But GDPR and CCPA are just the beginning of the regulations. We’re also in the midst of the death of the third-party cookie, which is making it harder for companies to track what users do online. Google’s decision to delay the end of third-party cookies on its platform until January 1, 2023 gave marketers some extra time to adapt, but the data from the cookies will be tough to replicate.

In addition to data regulations, we’re on the cusp of new regulations on the use of AI. The European Union introduced the AI Act in 2021, and experts predict it could become law by the end of 2022 or early 2023.

Battle of the Data Table Formats

A classic tech battle is shaping up over new data table formats that will determine how data is stored in big data systems, who can access it, and what users can do with it.

Apache Iceberg has gained steam in recent months as a potential new standard for data table formats. Cloud data warehouse giants Snowflake and AWS came out early this year in support of Iceberg, which provides transactions and other controls on data and emerged from work at Netflix and Apple. Cloudera, the former Hadoop distributor, also backed Iceberg in June.

But the folks at Databricks are offering an alternative in the Delta Lake table format, which offers capabilities similar to Iceberg’s. The Apache Spark backers initially developed the Delta Lake table format in a proprietary manner, which led to accusations that Databricks was setting customers up for lock-in. But at the Data + AI Summit in June, the company announced it was committing the entirety of the format to open source, thereby letting anyone use it.

Lost in the shuffle is Apache Hudi, which also provides consistency in data as it sits in big data repositories and is accessed by various compute engines. Onehouse, a venture backed by Apache Hudi’s creators, launched earlier this year with a Hudi-based lakehouse platform.

The big data ecosystem loves competition, so it will be interesting to watch these formats evolve and battle it out over the rest of 2022.

Language AI Continues to Wow

The cutting edge of AI is getting sharper by the month, and today, the tip of the AI spear is large language models, which keep getting better. In fact, large language models have gotten so good that a Google engineer in June claimed that the company’s LaMDA conversational system had become sentient.

The AI isn’t sentient yet, but that doesn’t mean large language models aren’t useful to the enterprise. We’re reminded that Salesforce has a large language model (LLM) project called CodeGen, which seeks to understand source code and even generate its own code in different programming languages.

Last month, Meta (the parent company of Facebook) unveiled a large language model that can translate among 200 languages. We’ve also seen efforts to democratize AI through projects like the “BigScience Large Open-science Open-access Multilingual Language Model,” or BLOOM.

What are your predictions for the rest of 2022? Contact us to let us know.