Denodo - Logical Data Warehouse (LDW)

This document will explain what the Logical Data Warehouse (LDW) is, how to get access to the LDW, and how to integrate LDW data into your unit's datasets.

The Logical Data Warehouse (LDW)

The LDW is a complete, virtual representation of the Enterprise Data Warehouse (EDW) here at the University of Illinois. Instead of having numerous direct connections to the EDW, which may impact performance, Decision Support decided to virtually replicate the entire EDW in Denodo as a Logical Data Warehouse. Note that individual connections to the EDW from Denodo will not be allowed.

The LDW contains every table and view in every schema within the EDW, while maintaining the structure and names, as individual Base Views. If an account has access to a particular table/view in the EDW, it will also have access to the corresponding base view in Denodo (see Access to the LDW section below).

The LDW is perfect for units that need to combine their own datasets with data from the EDW, and it allows units to combine this data without the need to move or store any of it. The appeal of a Logical Data Warehouse comes from the huge time and cost savings obtained in comparison to the “physical” approach, where all data needs to be previously copied to a single system. Physically replicating data means more hardware costs, more software licenses, more ETL flows to build and maintain, more data inconsistencies and more data governance costs. The logical approach leverages the existing systems while minimizing or removing the need for further replication. The Denodo optimizer uses a variety of techniques for this purpose including query delegation, advanced join optimization, and parallel query processing.

 

Access to the LDW

Access to the EDW will be mirrored in the LDW. The access a personal or application account has in the EDW will be the same access that account has in the LDW. Similarly, if an account needs access to data in the LDW, the account owner's Unit Security Contact (USC) will need to request access to that data in the EDW.

Accessing the LDW in Denodo

  1. Log into the Denodo Solution Manager.
  2. Choose the environment you want to work in and click Open in the Design Studio tile.
  3. On the left side of the screen, in the Explorer panel, you will see a virtual database (VDB) named logical_data_warehouse, click the plus icon to expand the folder.
  4. Expand the 01 - Connectivity folder and then expand the 02 - Base Views folder.
    1. This will display the list of data sets this account has access to.
    2. The EDW is one of these data sets, but there are Data Marts (dm_) and Data Collections (dc_) about specific subject areas that you may have access to as well.
  5. For example, expand the EDW folder.
    1. This will display the list of base views within the EDW data set this account has access to, in alphanumeric order.
  6. Opening a base view (vertical ellipse icon) will display the column names and field types, as well as allow users to execute Select * queries in the Execution Panel, and create integration views in the Create tab. Note that creating a derived view from an LDW base view will require that you change the database and folder on the Metadata tab.
  7. Note that the entire LDW is not editable by users. Create integration views of the LDW base views if you wish to edit them.

Integrating LDW data into My Datasets

As mentioned above, the real power and appeal of the LDW comes from the huge time and cost savings obtained in comparison to the “physical” approach, where all data needs to be previously copied to a single system. Physically replicating data means more hardware costs, more software licenses, more ETL flows to build and maintain, more data inconsistencies and more data governance costs. The logical approach leverages the existing systems while minimizing or removing the need for further replication.

Denodo offers users multiple ways to integrate data from two or more base views, including Unions, Joins, Selections, Flattened, Intersect, Minus, and Interface views. Below we will demonstrate how a simple Join view could be created between an LDW data set and a unit's own data set. For more information on how to create these types of integration views, see the Denodo - Creating Integrated Views KB article.

Example: Creating a Join integration view

In this example, we have data from a fictitious survey that was given to students in Biology courses at UIS during the Fall 2016 term and we want to join that with LDW data about the courses/sections to analyze the impact of instructional method, grading method, etc. on survey responses.

Integration views can be created from any view being used in it, so for this example, we can either start with our survey data view or the LDW view, T_SECT_BASE.

  1. Select a view to include in the Join integration view.
  2. Open the view by double clicking it in the Explorer panel, click Create and select Join. Or left-click the ellipse icon next to the view name in the Explorer panel, hover over New and select Join.
  3. On the Model (join) tab, drag the other view you would like to join from the Explorer panel into the white space.
  4. Drag the circle icon in the top right corner of one view to the circle icon in the top right of the other view to join them.
  5. In the panel that opens at the bottom of the screen, drag fields you want to join on from the left column to the corresponding fields in the right column.
    1. In this example, we will join termcode from the survey data with term_cd from the t_sect_base LDW table,
    2. coursesubj from the survey data with crs_subj_cd from the t_sect_base LDW table,
    3. coursenumber from the survey data with crs_nbr from the t_sect_base LDW table,
    4. and sectionnumber from the survey data with sect_nbr from the t_sect_base LDW table.
  6. On the Join Conditions tab, you can add additional Join conditions as well as adjust the Join Type, Join Method and Join Order.
  7. On the Where Conditions tab, you can add Where conditions to filter down the data being returned. In our example, since we know the term the survey data is from, we could add a where clause of Term_Cd=420168 to limit the number of rows returned from T_SECT_BASE and improve overall query performance.
  8. On the Group By tab, you can set the Group By fields and Having conditions.
  9. On the Output tab, you can select Order By fields, change the view name, and edit or remove fields from the output.
  10. On the Metadata tab, make sure the Database you're saving this view to is your unit's VDB and not the LDW. You can select the folder you want the view saved to and add a description, if needed.
  11. Click Save.
  12. Clicking Execution Panel and Executing the query will return the first 150 rows of the new integrated data set.


Keywords:
enterprise EDW 
Doc ID:
142930
Owned by:
Alex H. in UI Training and Development Resources
Created:
2024-10-09
Updated:
2024-11-13
Sites:
University of Illinois Training and Development Resources