Product Roadmap
Sunbird Obsrv 2.0 has been redesigned from the ground up so that ingestion, processing, and querying of telemetry data are agnostic of the telemetry data specification. Release 5.1.1 is the last stable release of Sunbird Obsrv 1.0, which is tightly coupled with the Sunbird Telemetry Specification. The product roadmap for Sunbird Obsrv 2.0 is detailed below, with features logically grouped under specific functional areas.
Sunbird Obsrv ISSUE TRACKER: This is the link to the set of issues, submissions, and requests being considered for development as part of the Sunbird Obsrv roadmap. You can upvote an issue if you find it relevant, or add a new issue to the list.
Obsrv AMJ-2024 Stories
APIs Refactoring
Dataset CRUD API
Dataset Management API
Data In Out APIs
Query Template APIs
Automation Refactoring
Helm charts refactoring
Wrapper over helm charts
Connectors
Connectors Framework
Connectors Management
Connectors Implementation
Hudi Integration
Create an ingestion spec for Lakehouse tables
Streaming job implementation to write Raw Data to Lakehouse
Timestamp Based Partitioner for Apache Hudi
Hudi Sink Configuration Optimization
Deployment Automation for Hudi Sink Connector - AWS/Local Datacenter
Query API Unification for the Lakehouse/Real-Time store - Design
Query API Unification for the Lakehouse/Real-Time store - Implementation
Dedup Challenges with introduction of Lakehouse - Design
Dedup Challenges with introduction of Lakehouse - Implementation
Design Rollups for the Lakehouse
Rollups implementation for the Lakehouse
Deployment Automation for Hudi Sink Connector - Azure
Deployment Automation for Hudi Sink Connector - GCP
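The timestamp-based partitioner above maps onto Hudi's TimestampBasedKeyGenerator. A minimal sketch of the relevant writer settings follows; the event-time field name (`ets`) and the output date format are illustrative assumptions, not the actual Obsrv configuration:

```properties
# Partition Hudi tables by an epoch-millisecond event-time field (illustrative)
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.TimestampBasedKeyGenerator
hoodie.datasource.write.partitionpath.field=ets
hoodie.keygen.timebased.timestamp.type=EPOCHMILLISECONDS
hoodie.keygen.timebased.output.dateformat=yyyy/MM/dd
```

With settings like these, a record's event timestamp is rendered as a `yyyy/MM/dd` partition path, which keeps time-range queries on the Lakehouse tables pruned to the relevant folders.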
Obsrv 2.0.1 GA Release date - 29th Feb'24
Enhancements
Core pipeline enhancements
Denormalization logic now handles empty, text, and numeric keys.
Moved failed-event sinking into a common base class.
Updated the framework to create a dynamicKafkaSink object.
The master dataset processor can now denormalize against another master dataset as well.
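The handling of empty, text, and numeric denorm keys can be illustrated with a small sketch. This is a hypothetical resolver, not the actual Obsrv pipeline code; the function name and normalization choices are assumptions:

```python
# Hypothetical sketch of denorm-key resolution; not the actual Obsrv pipeline code.
def resolve_denorm_key(event: dict, key_path: str):
    """Walk a dotted path into the event; return a normalized join key or None."""
    value = event
    for part in key_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    if value is None or value == "":
        return None                 # empty keys: skip the denorm lookup
    if isinstance(value, bool):
        return None                 # booleans are not meaningful join keys
    if isinstance(value, (int, float)):
        return str(int(value))      # numeric keys: normalize to a string form
    return str(value)               # text keys pass through as-is
```

Returning `None` for empty or missing keys lets the pipeline skip the lookup instead of failing the event, while numeric keys are normalized so `42` and `42.0` resolve to the same master record.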
Tech software upgrades - Postgres, Druid, Superset, NodeJS, Kubernetes.
Dataset Management - Added timezone handling to store data in Druid in the timezone specified by the dataset.
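The timezone handling can be sketched as a simple conversion at indexing time. This is a minimal illustration assuming epoch-millisecond event timestamps; the function name and field choices are hypothetical, not Obsrv's actual API:

```python
# Hedged sketch: rendering event timestamps in a dataset-configured timezone.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_dataset_tz(epoch_ms: int, tz_name: str) -> str:
    """Convert an epoch-millisecond timestamp to ISO-8601 in the dataset's timezone."""
    dt = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return dt.astimezone(ZoneInfo(tz_name)).isoformat()
```

For example, a dataset configured with `Asia/Kolkata` would see epoch 0 stored as `1970-01-01T05:30:00+05:30` rather than the UTC rendering.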
Infra reliability
Verification of metrics for various services.
Metrics instrumentation for some of the services.
Enhancements to the dataset API service - API endpoint changes.
Benchmarking - Processing, Ingestion & Querying.
Bug Fixes and Improvements
Fix unit tests and improve coverage.
Automation script enhancements to support new changes.
Feature
Connector Framework implementation
Hudi Data Ingestion
Obsrv 2.0.0 GA Release date - 31st Dec'23
360 degree observability
Enhancements in datasets management - These features enhance data management capabilities and provide flexibility in dataset maintenance.
Users can now delete draft datasets using the API.
Users can now retire live datasets using the API.
Ability to configure the denorm on the master datasets
Core pipeline enhancements - Core pipeline changes to route all failed events, with a detailed summary, to the failed topic.
Detailed Debugging with Query Store - Indexing all failed events into the query store for comprehensive and detailed debugging. This feature improves troubleshooting capabilities and accelerates issue resolution.
Automated Backup Configuration - Automated the configuration of the backup system during installation, allowing users to define their preferred timezone. This ensures a seamless and customized backup setup for enhanced data protection.
Aggregated and Filtered Data Sources - Introduced the option to create aggregated data sources and filtered data sources through the API. Also enhanced API to query on the rollup datasources. This feature provides users with more control over data sources, enabling customization based on specific requirements.
Filtered Rollups - Ability to create rollup on a filter using API
Bug Fixes and Improvements
Fixed the auto-conversion of the timestamp property to a string.
Automatically convert numeric field to Integer during ingestion into Query Store/Druid.
Improved the code coverage of obsrv core to 100%.
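The numeric auto-conversion fix above can be illustrated with a small sketch. This is a hypothetical version of the coercion, not the actual pipeline code:

```python
# Hypothetical sketch of the numeric coercion applied before indexing into the
# Query Store/Druid; the real pipeline logic may differ.
def coerce_numeric(value):
    """Convert whole-number floats (e.g. 42.0) to int so they index as integers."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value
```

The idea is that JSON parsing often yields `42.0` where the producer meant `42`; coercing integral floats keeps the Druid column typed as a long instead of a double.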
Connector Ecosystem
These connectors enable data IN and OUT of the platform and expand its reach.
JDBC connector - This connector supports popular databases such as MySQL and PostgreSQL.
Data Stream Source Connector - This connector supports real-time streaming data sources such as Apache Kafka.
Obsrv 2.2.0 Release date - 31st Oct'23
360 degree observability
Addressed major vulnerabilities, making the Obsrv BB free from known vulnerabilities.
Restarting Flink to pick up new datasets. The command service with the Flink restart command needs to be open-sourced.
A few bug fixes to components and deployment.
Obsrv 2.1.0 Release date - 30th Sep'23
360 degree observability
Restarting Flink to pick up new datasets. The command service with the Flink restart command needs to be open-sourced.
Refactor of Sunbird Ed Cache Updater Jobs
Enhance the Device register/profile API to send transaction events to the Obsrv 2.0 system using the Data IN API or the Kafka connector
Connector Ecosystem
Object Storage Connectors - Cloud storages
MinIO connector
Connector Marketplace/Framework
Postgresql Connector for data & denormalization
Obsrv 2.1.0 Release date - 31st Aug'23
360 degree observability
Data Exhaust API - Ability to download raw data using the Data Exhaust APIs
Druid Query and Ingestion Submission Wrapper APIs - Wrapper APIs for Druid queries and ingestion spec submission
API system to generate Data IN/OUT metrics
Integrate the Obsrv superset with Query Wrapper APIs
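As a rough illustration of what a Druid query wrapper would forward, Apache Druid's native SQL endpoint (`POST /druid/v2/sql`) accepts a JSON body like the one below. The datasource name `telemetry-events` is a made-up example, not an actual Obsrv table:

```json
{
  "query": "SELECT COUNT(*) AS total FROM \"telemetry-events\" WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY",
  "resultFormat": "object"
}
```

A wrapper API in front of this endpoint can validate the query, inject dataset-level access controls, and record Data OUT metrics before proxying the request to Druid.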
Simplified Operations
Configure Obsrv to run with the MinIO object store
Support for multi-channel alerts
Enable labels for all the services
Obsrv 2.0.0 GA(Planned release date - 2 May'23)
360 degree observability
Data Set Creation via APIs
Data Denormalization with API & Push to Kafka
Real time querying
Druid SQL & JSON query interface
Out-of-the-box visualizations with Superset
Simplified Operations
One-click installation for AWS, Azure & GCP
Standard Monitoring Capability
Log Streaming with Grafana UI
Connector Ecosystem
Kafka connector for data & denormalization
Unified Web Console
General Cluster metrics
Release-5.1.0 (Planned release date - 04 Nov'22) - Projects
Project : Enabling ease of adoption
Task : One click install enhancements
a. Include the data products (workflow summary) as part of the one-click installer package. These allow easy generation of common summaries once raw telemetry is generated by the adopter.
b. Include a sample data set and some pre-configured charts as part of the one-click installer package so that an adopter trying it out can get a feel of the types of charts that can be generated, and the kind of data required to be generated for this purpose.
Project: Enabling ease of adoption
Task : Additional documentation
Additional pending Sunbird Obsrv documentation work
Project: Enabling ease of adoption
Task: Creation of a Learning module for Sunbird Obsrv
Compile existing learning resources and create new ones to put together a learning module/course that will allow a developer to get familiar with the Obsrv building block, and potentially get "Obsrv certified" by earning a certificate after passing an assessment.
4.10.0 : Planned for 06 Jun '22
1. Functional Requirements: 4.10.0
- API level access to data files that power the reports and charts on the portal
- Ability to configure Sunbird datasets as Public or Private at an instance level
- KT for indexing variables into Druid
2. Deployment and Release Processes: 4.10.0
Build, deploy, and provisioning scripts: the provisioning and deployment scripts for the BB will be refactored, and the Sunbird dev and staging environments will be repurposed for each BB. The deployment scripts need to be refactored to sandbox the environment on Kubernetes for the Sunbird Obsrv BB. This will be done by deploying the services and components into a separate, configurable Kubernetes namespace for the Sunbird BB.
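A minimal sketch of the namespace sandboxing described above; the namespace name `sunbird-obsrv` and the release/chart names are assumptions, not the project's actual values:

```yaml
# Dedicated, configurable namespace for the Obsrv BB deployment (illustrative)
apiVersion: v1
kind: Namespace
metadata:
  name: sunbird-obsrv
```

Services and components would then be installed into that namespace, e.g. `helm install obsrv ./obsrv-chart -n sunbird-obsrv --create-namespace`, keeping the BB isolated from other workloads on the same cluster.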
5.0.0 : Planned for 19 Aug'22
1. One click mini installation of data pipeline components on Kubernetes: 5.0.0
- Currently not all components are deployed onto Kubernetes and it takes significant effort from the adopter to get all the components up and running. This capability will allow the adopter to quickly install required components on Kubernetes and have the entire analytics platform up and running.
2. Publish the minimum number of Kubernetes nodes required for one click installation: 5.0.0
- Analyse and publish minimum number of nodes required to get the analytics platform up and running on Kubernetes. Also, publish the mandatory components required to get the installation up and running.
3. Migration of API Swagger documentation to Sunbird Obsrv Building block pages: 5.0.0
- Migrate the existing Sunbird Swagger API documentation for the analytics and data exhaust APIs to the Sunbird Obsrv Building block pages.
4. Multi-cloud support for blob store: 5.0.0
- Update the framework, cloud-storage-sdk, Secor and Flink checkpoints to work with GCP as well
- Generalize the analytics framework, cloud-storage-sdk, Secor and Flink checkpoints to work with AWS S3, Azure Blob Storage and Google Cloud Storage
- Add relevant configuration files required for the generalization
- Be able to deploy existing microservices into a different namespace (SB Ed)
1. Documentation to configure data exhaust reports using the APIs: Detailed explanation of the different sections of the configuration schema.
- Add documentation explaining the configuration schema for the various data exhaust report APIs
- Add detailed documentation on the usage of the APIs