Denodo - Creating Data Sources from Box

Box.com is a platform used at the University of Illinois for storing and sharing documents, which makes it a good places to store documents that you want to integrate with Denodo. However, there are some nuances to be aware of in order to create stable integrations. This document is intended to establish best practices for creating stable integrations between Denodo and Box.

Box integrations use Box “applications” which are authenticated with the OAuth 2.0 specification. OAuth 2.0 is a token-based authentication system. This document will provide instructions for the creation of the Box application and the configuration of OAuth 2.0.

Files

Best Practice: Use CSV files rather than Excel files. CSV files tend to be more stable in Denodo in general and we have also noticed that Excel files accessed over HTTP paths have limitations. Excel limitations can be avoided by converting them to CSV before integrating them with Denodo.

Best Practice: Use the same basic format for all your CSV files. Not only is consistency a good idea, but it will make the Box integration more stable as well. In this document we will standardize on using the comma ( , ) character for the column delimiter, \n for the end of line delimiter, and the first row will have column headers.

Each file in Box is given a unique ID. When you click on a file in box the number just to the right of “file/” in the URL is the file ID.

Example: https://app.box.com/file/1471507741297. The file ID is 1471507741297. You will need to know the IDs of the files you want to integrate in Denodo.

Creating Box Custom Applications

The first step is to create an "application" in the Box Dev Console. This creates the account that will be used to authenticate into Box.

Screenshot of the Custom App tile

  • Enter a name for the app according to your naming standards in the App Name field.
  • Enter a concise Description.
  • For Purpose, select Integration.
  • For Categories, select Collaboration.
  • For Which external system are you integrating with?, enter “Denodo” into the field.
  • For Who is building this application?, select Customer.
  • Click Next.

Screenshot of the Create a Custom App window with all the fields filled in

  • Select User Authentication (OAuth 2.0).
  • Click Create App.

Screenshot of Step 2 showing User Authentication (OAuth 2.0) selected

  • Select the Configuration tab.
  • Scroll down to the OAuth 2.0 Credentials section.
  • Copy and Paste the Client ID and Client Secret to a convenient, easily accessible location, such as Keepass.
    1. NOTE: The developer token is not necessary for our use. However, you will need the Client ID and Client Secret for the data source configuration in Denodo.

OAuth 2.0 Credentials section showing the Client ID and CLient Secret fields

  • In the OAuth 2.0 Redirect URIs section, add the OAuth 2.0 Redirect URI of the environment you'll be using with this app and click Add. The URIs for the different environments in Denodo are listed below:

Production: https://datahub.uillinois.edu/oauth/2.0/redirectURL.jsp

QA: https://qa.datahub.uillinois.edu:9443/oauth/2.0/redirectURL.jsp

Dev: https://dev.datahub.uillinois.edu:9443/oauth/2.0/redirectURL.jsp

Screenshot of the Redirect URIs section showing the Dev Redirect URI being added

  • In the Application Scopes section, make sure the check boxes are checked for both Read all files and folders stored in Box and Write all files and folders stored in Box.

Screenshot showing the Read... and Write... options are both checked

  • Click the Save Changes button in the top right.
  • Click the My Apps option on the left will bring you back to your list of application.
  • Click the Create New App button in the top right and create 2 more Box applications - one for the QA environment and one for the Production environment,
  • After creating these three applications, your list of apps should include three for Denodo.

Screenshot of the My Apps screen showing the three applications created for Denodo

 

OAuth 2.0 Tokens

This document will not cover all the details of the OAuth 2.0 specification, but it is helpful to know a little bit about what is happening so the next steps make a little more sense.

In the above steps we created a Box Custom Application which generated a Client ID and Client Secret. You can think of this Client ID and secret like an application service account, with a password. However, the Client ID and Secret are not sufficient to fully authenticate into Box and access data. Instead, the client application, Denodo in this case, will request an access token from the Box OAuth server. When Denodo asks for an access token the OAuth server returns both an access token and a refresh token. This access token is used to make requests for data within Box. However, the access token is only good for about an hour. After that time the client application (Denodo) will need to use the refresh token to request a new access token from the OAuth server. The OAuth server returns a new access token and a new refresh token. This new refresh token must be used to make the next request for a new access token.

Denodo will perform the actions of using refresh tokens to get new access tokens, but there are things that cause this process to become unstable. The suggestions in this document are meant to help avoid those issues.

Create a Date Source in Denodo

For this document we will use CSV files. In this example we have 3 CSV files of course data, instructor salary data, and awarded grant data.

Best Practice: Only use the Box App credentials in a single Denodo connection. Creating more than one Denodo connection that re-uses the same Box App Client ID will cause the access tokens associated with that Client ID to interrupt one another, and your integration will become unstable.

Since we do not want to use the same Box app credentials for more than one connection, it is a good idea to make the data source reusable. The below instructions show how to make a reusable CSV connection to Box.

  • In Denodo Design Studio, click the context menu for 01 - Data Sources and select New > Data source > Delimited file.

Screenshot showing the path to create a new delimited data source

  • Give the data source a name.
  • For Data route, select HTTP Client.
  • Leave the HTTP method set to GET, and add the following URL to the Base URL field.

https://api.box.com/2.0/files/@FileID/content

The link above uses an interpolation variable for the File ID, which is what will make this connection reusable.

Screenshot showing the top section of the new delimited data source's config page

  • Expand the Authentication section, and select OAuth 2.0 from the Authorization options.
  • Leave the Authorization Grant value set to Authorization code grant, and paste the Client identifier and Client secret from your Box app in their respective fields.

Screenshot showing the Authentication section fields filled in

  • Click the link titled Launch the OAuth 2.0 credentials wizard to help you obtain these credentials, which should now be enabled. If the link is not enabled, go to the Tools tab at the top right and select OAuth credentials wizards > OAuth 2.0 wizard.

Screenshot of the link

  • Populate the Token endpoint URL and the Authorization server URL fields with the following values.

Token endpoint URL: https://account.box.com/api/oauth2/token

Authorization server URL: https://account.box.com/api/oauth2/authorize

  • For Redirect URI, select the Other radio button and populate that field with one of the following redirect URIs, based on which environment you're working with.

Production: https://datahub.uillinois.edu/oauth/2.0/redirectURL.jsp

QA: https://qa.datahub.uillinois.edu:9443/oauth/2.0/redirectURL.jsp

Dev: https://dev.datahub.uillinois.edu:9443/oauth/2.0/redirectURL.jsp

Screenshot showing the token, authorization and redirect URLs filled in

  • Expand 2. Generate the Authorization URL, click the Generate link to populate the Authorization URL field, and then click the Open URL link.

Screenshot showing the 2nd section of the OAuth 2.0 credentials wizard fields being filled in

  • If a web page opens asking you to Grant access to Box, click on that blue button to grant access.
  • Copy the URL on line 3 of the web page that opens.

Screenshot of the web page that opens displaying the authorization response URL

  • Expand 3. Paste the authorization response URL and paste the link you just copied.
  • Expand 4. Obtain the OAuth 2.0 credentials and click the OAuth 2.0 credentials link.

Screenshot showing the 3rd and 4th sections of the OAuth 2.0 credentials wizard

  • You should receive the following confirmation that the OAuth 2.0 credentials were obtained.

Confirmation that the OAuth 2.0 credentials have been successfully obtained

  • Click the Copy credentials to the clipboard link, and then click OK.
  • Scroll down to the Column delimiter field and add a comma ( , ), and check the box next to Header.

Screenshot showing the delimiter and header options

  • Click Save.
  • Once the data source is saved, we can click Test Connection, which will bring up an Interpolation variable screen where we'll need to enter a File ID for the first CSV file in Box. See the Files section above to learn how to obtain these File IDs. Click OK.

Screenshot of the Interpolation variable screen with a value entered

Best practice: Once your connection is initially tested, avoid using the “Test Connection” feature. We have found that doing so could introduce an issue with the authentication.

Create Base Views

In this document we will create 3 base view to illustrate how to reuse the box data source for multiple CSV files.

  • Open the Data Source created above and click the Create Base View link.

Screenshot of the Create base view link

  • You will be prompted for a FILEID. Enter the File ID of the first CSV file and click Ok.

Screenshot of the Interpolation variable screen with a value entered

  • In the context menu for the base view we just created, select Rename and give the base view a new name.
  • Drag the base view from the 01 - Connectivity / 01 - Data Sources folder to the 01 - Connectivity / 02 - Base Views folder.
  • Repeat the steps above to create base views for the other two CSV files. When complete, you should have one data source and three base views.

Screenshot showing 1 data source and 3 base views

Create Derived Views that Encode the File IDs in the View

There is one aspect of this solution to make the data source reusable that is a little annoying. If we were to execute these base views, we would be prompted for the File ID each time. If we use the wrong File ID, our query may not return any data. We can take care of that by creating integrated views of these base views.

There are a few ways to accomplish creating these integrated views.

  1. Using the GUI
    • In the context menu for the base view, select New > Selection.
    • On the Where Conditions tab, enter fileid = 1652491525507
  2. Using a VQL script
    • Run the following script in the VQL Shell to create the integrated view and have them placed in the 02 - Integration folder. Note: your integrated view name, folder names, base view name and fileid will be unique to your data source and VDB.

CREATE OR REPLACE VIEW iv_box_csvs_course_schedule_data FOLDER = ' / ahall1 / 02 - Integration'

AS

select * from bv_box_csvs_course_schedule_data where fileid = '1652491525507';

Now, we can query these integrated views directly, without having to enter a File ID each time.

Note: You may need to click the File tab at the top and select Refresh to see the new integrated views you just created.

Screenshot showing 3 integrated views



KeywordsDatahub   Doc ID136730
OwnerLearning Systems SupportGroupUI Training and Development Resources
Created2024-04-10 13:56:31Updated2024-10-23 08:27:59
SitesUniversity of Illinois Training and Development Resources
Feedback  0   0