Component Diagram



The On Demand Druid Exhaust service generates CSV reports based on user requests. Since it is a generic data product, users can request a CSV report for selected columns using filters.

  1. Database Layer:

  • PostgreSQL Database (job_request): Job requests are stored in this table. Each request includes information about the job configuration.

  • Druid: Flattened data is stored here with the help of ingestion specs, and data for specific datasources can be retrieved using Druid queries (a query sketch follows the table below).

| Database   | Table/Datasources |
| ---------- | ----------------- |
| PostgreSQL | job_request |
| Druid      | sl-project, sl-observation, sl-observation-status, sl-survey, ml-survey-status |
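
For illustration, here is a minimal Python sketch of how a data product could read pending requests from the job_request table and scan one of the Druid datasources listed above. The column names (request_id, status, request_data), the SUBMITTED status value, and the connection details are assumptions for the example, not the actual Obsrv implementation.

```python
import psycopg2   # PostgreSQL driver (assumed client library)
import requests   # used to POST a native scan query to the Druid broker


def fetch_pending_requests(dsn: str):
    """Read job requests that have not been processed yet (columns are illustrative)."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT request_id, request_data FROM job_request WHERE status = %s",
            ("SUBMITTED",),
        )
        return cur.fetchall()


def scan_druid(broker_url: str, datasource: str, columns, interval: str):
    """Run a native Druid scan query against one of the datasources listed above."""
    query = {
        "queryType": "scan",
        "dataSource": datasource,      # e.g. "sl-project"
        "intervals": [interval],       # e.g. "2023-01-01/2024-01-01"
        "columns": columns,            # only the columns the user selected
        "resultFormat": "list",
    }
    resp = requests.post(f"{broker_url}/druid/v2", json=query, timeout=60)
    resp.raise_for_status()
    # Flatten the per-segment response into a plain list of row dictionaries.
    return [event for segment in resp.json() for event in segment["events"]]
```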

  2. Data Processing Layer: Apache Spark is used to perform transformations, sort columns, eliminate duplicates, and replace unknown values with null. This process improves data quality and organizes the data logically before it is written to CSV.
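
A minimal PySpark sketch of the transformations described above, assuming the literal string "Unknown" is the marker for unknown values; the function name and the single-file CSV output are illustrative choices.

```python
from pyspark.sql import DataFrame
from pyspark.sql.functions import col, when


def transform_and_write(df: DataFrame, output_path: str) -> None:
    """Clean a flattened Druid extract and write it out as a CSV report."""
    # Sort the columns so every report has a stable, predictable column order.
    ordered = df.select(*sorted(df.columns))

    # Eliminate duplicate rows.
    deduped = ordered.dropDuplicates()

    # Replace unknown values with null ("Unknown" as the marker is an assumption).
    cleaned = deduped.select(
        *[when(col(c) == "Unknown", None).otherwise(col(c)).alias(c)
          for c in deduped.columns]
    )

    # Write a single CSV file with a header row.
    cleaned.coalesce(1).write.mode("overwrite").option("header", True).csv(output_path)
```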

User Interaction Diagram

This interaction diagram details the complete process of requesting and generating a report. The user requests a specific report through SunbirdEd from the program dashboard, and the exhaust APIs map the request to Sunbird Obsrv. The OnDemandDruidExhaust data product is triggered by a scheduled cron task, which queries PostgreSQL and Druid to fetch the data, processes it with Spark to transform the data, and generates the report. The user receives the report once it has been created.
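
Tying the steps together, a hedged sketch of the cron-triggered flow described above, reusing the helpers from the earlier sketches; the job-configuration keys and the status update are hypothetical.

```python
import psycopg2
from pyspark.sql import SparkSession


def mark_request_complete(dsn: str, request_id: str) -> None:
    """Hypothetical status update so the dashboard can surface the finished report."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(
            "UPDATE job_request SET status = %s WHERE request_id = %s",
            ("SUCCESS", request_id),
        )


def run_exhaust_job(dsn: str, broker_url: str, report_dir: str) -> None:
    """Invoked by the scheduled cron task."""
    spark = SparkSession.builder.appName("on-demand-druid-exhaust").getOrCreate()

    for request_id, request_data in fetch_pending_requests(dsn):
        # request_data is assumed to be a jsonb job configuration decoded to a dict;
        # the keys below are illustrative, not the actual request schema.
        rows = scan_druid(
            broker_url,
            request_data["datasource"],
            request_data["columns"],
            request_data["interval"],
        )

        # Transform with Spark and write the CSV report for this request.
        df = spark.createDataFrame(rows)
        transform_and_write(df, f"{report_dir}/{request_id}")

        # Mark the request as completed so the user can download the report.
        mark_request_complete(dsn, request_id)
```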

Filter Format From UI

References: ml-analytics, Model Config