Unfortunately there's not an automated mechanism for doing this. As a developer or admin, from Quickstart Application, I want to deploy DHIB components to MarkLogic so that I can get started quickly and easily. DHFPROD-1726: "Update a Hub Project link" produces error, Update modules count after addition of 5.x modules, DHFPROD-1754 + other issues - develop branch, DHFPROD-1740, Create 5.x FlowManager and refactor 4.x FlowManager to LegacyFlowManager, DHFPROD-1710 - create space for the new dhf5 code (still rewriting to, DHFPROD-1751 - update develop snapshot version, Develop bug fixes related to LoadUserArtifactsCommand, Fix LoadUserArtifactsCommand and tests in 4.x, Loading to staging schemas db from database-specific directory in develop, Revert "DHFPROD-1428: Improve the usability of text input elements", DHFPROD-1428: Improve the usability of text input elements, E2e/no toaster wait -- comment out waiting for toaster after updating index, Updated tests to exclude those that are bound to fail in DHS, DHFPROD-1427 - Improve the usability of switch elements, Calling hubInstallModules in Installer class when running tests in DHS, Create 'LoadUserArtifactsCommand' for loading entities, mappings, Updating gradle-dhs.properties to run DHF core tests in DHS, Feature: Swagger powered mock api framework, Better handling of nested objects as properties when property is not defined as a formal entity, array, or scalar value, mlDeployDatabases ignores config files under entity-config, mlWatch doesn't load from src/main/ml-modules, certificate-templates and external-security config not being deployed from ml-config, DHF 4.0.0: mlDeployDatabases not deploying config from src/main/ml-config (same for mlDeploySecurity), DHF 4.0.0: mlDeploy fails (in some conditions) if project contains REST extension in ml-config, Modules location and deployment in DHF400, hubinit task should create a "stub" gradle-local.properties, Require workaround for deploying flexrep for data-hub-FINAL, if you call your mapping "mapping", it doesn't work (v4.1.0), If you call your input flow "input", it doesn't work (v4.1.0), If you call your harmonize flow "harmonize", it doesn't work (v4.1.0), Adding server namespaces in final-server.json breaks redeployment, mlLoadSchemas only loads to data-hub-staging-SCHEMAS. 143 - Added validation for duplicate REST service extensions and transforms.
The reason to shift to a DHF 5 flow is if/when you find value in the OOTB steps - specifically mapping, matching, and merging. The requirements for this project were drawn from the developers who were building operational data hubs for customers. [WORKAROUND] DHF does not deploy REST extensions, Support for mlConfigPaths and mlModulePaths properties like ml-gradle has, Traces not capturing error message or stack, Revert Spring boot version upgrade for QS, Incorporate referenced entity model definitions in same definitions, Support for ES models in content creation, Move entity management logic from QS to core lib, Add triggers for entity model TDE generation, gradle-dhs.properties for DHS integration tests, GH #1652 If $type is undefined, don't nest, Send percentComplete as -1 in case of an error, DHFPROD-490 - added invalid character check for entity title, Passing more than one options in input flows using mlcp, DHFPROD-1526 - Beautify trace errors on QuickStart UI, 1580 Added DHF4 project with test cases for verifying the deployment , DHFPROD-1652 fixed broken links and other tweaks, HubAppDeployer no longer loses functionality in SimpleAppDeployer, Integrate mlui-integration branch into develop, MLUI-258: externaldef-dialog.component.ts, Added DHF4 project with test cases for verifying the deployment, Update Spring Batch example to version 1.4.0, Update writers to be batched vs individualized - delete/dupe, Quickstart Data Hub job status/error popup needs word wrap, Create gradle command to generate a TDE Template, Allow specifying flow options for harmonization flows run from quickstart, Migration guidance from 1.0 (8) to 2.0 (9), Loading documents through input flow is failing, gradlew quick-start:e2eLaunch could not find or load main class com.marklogic.quickstart.Application, README.md link to "Data Hub website" in "Advanced Hub Usage" is broken, Running input flow produces error "MISSING_CURRENT_TRACE" and the documents are not loaded, Missing dhf.sjs when 
creating a new flow (blocker), Upgrade npmVersion to 5.6.0 on build.gradle to avoid error on Windows, Tutorial link hard coded to old (2.0.3) release, quickstart harmonize flow view elides tab labels to meaninglessness, QuickStart Browse Data throws XDMP-LEXVAL, Documents are not shown under Browse Data STAGING database after running the input flow, Install screen on quickstart is broken, unable to install hub [blocker], Run undeploy tasks with configured mlManageUsername, mlDeploy fails when run by an LDAP user with full admin rights, QuickStart won't connect to HTTPS-enabled App Services, Search results on jobs page is showing the wrong results when searching for "input" jobs, Trace view is not displayed after clicking the trace link, One entity's indexes configurations clobber all the others', When modeling Order entity, needs to add element range index on "id" property, Quickstart tutorial doesn't have $version on content.sjs, but the screenshots have it, QuickStart harmonize flow settings not persisted during browser session, Primary key is not retained if it's clicked first when adding a property, Huge ID lists from a collector fail with FRAGTOOLARGE if Tracing is on, Old entity name is still retained on property entities type, run-flow rest extension is not setting a default job-id, Tutorial - Can not create "Harmonize Orders" flow, QuickStart Project Initialize does not recognize change to hostname, Object type changed after upgrade to Hub 2.0 so fields are missing or undefined, The mlUndeploy task does not completely remove Data Hub, QuickStart - Browse Data does not display content for certain URIs. 
Need to specify collation in query in trace-lib.xqy, fixing bug in restoring previous load options, Changes to fix JS errors in Swagger UI in master, File permission error running hadoop to do data load, Errors only flash on the GUI for a short time, Investigate MLCP UI for creating MLCP cmd line options, DataHub.installUserModules should be "syncUserModules", 192 - Removed automatic closing of notification, Handle duplicate REST service extensions and transforms.
Input flows are run as transforms.
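To make "run as transforms" concrete, here is a minimal sketch in plain JavaScript. It does not use any MarkLogic or DHF API: `inputTransform`, `load`, the `source` option, and the envelope layout shown are all hypothetical names chosen for illustration. The idea it shows is only that the loader, not the flow, drives the process, and the flow is a function applied to each document on its way in.

```javascript
// Hypothetical sketch in plain JavaScript; it does not call any MarkLogic or DHF API.
// The envelope layout (headers, triples, instance, attachments) is an assumption.

// An "input flow" modeled as a pure transform: raw document in, envelope out.
function inputTransform(rawContent, options = {}) {
  return {
    envelope: {
      headers: { source: options.source || "unknown" },
      triples: [],
      instance: rawContent, // at input time the instance is simply the raw data
      attachments: null
    }
  };
}

// A stand-in for the loader (MLCP or a REST client in practice): it applies
// the transform to each document before "writing" it to an in-memory staging store.
function load(documents, transform, options) {
  const staging = new Map();
  for (const [uri, content] of Object.entries(documents)) {
    staging.set(uri, transform(content, options));
  }
  return staging;
}

const staging = load(
  { "/customers/1.json": { ID: 1, NAME: "Ada" } },
  inputTransform,
  { source: "crm" }
);
console.log(JSON.stringify(staging.get("/customers/1.json")));
```

In a real hub the store would be the staging database and the transform would be installed on the server; the in-memory `Map` here only stands in for that.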
Bugfixes, issues with truncation and mime types removal. A harmonize flow can break it up into separate documents and populate the envelope properties (URL, category, and so on). [DHFPROD-2703] - Extra array brackets added when saving weights on the fly on mastering step, [DHFPROD-2704] - Add validation message on batch size and thread count, [DHFPROD-2705] - Typo on add target collections class, [DHFPROD-2714] - Metadata datahubCreatedByStep has a value of currentStep and all prev step names, [DHFPROD-2722] - Error in connecting to the Data Hub API when trying to GET Flows in Quickstart, [DHFPROD-2741] - makeEnvelope() should accept a Sequence for headers, [DHFPROD-2763] - DHF 5.0.1 Generated TDE templates include rows for external references, Flows and step definitions are not visible to non-admin users in ML 10 when DHF is installed by admin user, [DHFPROD-2776] - Provide text value on Target URI Preview for validation. This data can be written directly to the final content database. When exception is thrown, not all flow traces are persisted. We'd need to write this flow for each of the input sources, assuming that the property would be found in different places in the various sources, but this would require very little coding. 
(sjs|xqy), Next button not working when browsing to new hub project directory, Quickstart gets stuck in Loading with js error, scaffolded flow from empty ES model has errors, Need to properly escape the path for RegEx, Increase gradleVersion to 3.4 for the wrapper task, Example: Make a barebones example for cmd line ninjas, Ports 8010 and 8011 conflicting with Ops-Director, Example: Migrating a RDBMS to Data Hub using Spring Batch, Providing different source/dest DB for the hubRunFlow does not work, Job status stuck on STARTED for Input Flows, File not saving properly from quickstart to fix a bug, Error saving entity - collation not legal, collection name is hard coded in online store example, add (rest-extension-user,read) to XML documents in modules-db, column-width, or tooltip with full "Identifier" in traces table, Initializing DHF Project against existing DB is dropping indexes, Saving the changes in a flow code never finishes *sometimes*, Sometimes a trace for a failed Harmonize job is not available/not existing, Unable to ingest image (.png) documents using DHF Quick-Start application, Load files into the Data Hub schemas database, Cannot specify default permissions for data-hub-staging-MODULES db, Code deploy fails when An entity is deleted, when scaffolding code for an array, have it be empty list [] rather than null, gradlew is generated without execute permission, RunFlowTask using dyslexic string for hub key, QuickStart app doesn't work on Internet Explorer 11, Redeploy modules removes trace and debugging settings, Quickstart Application not working on Internet Explorer, Envelope instance created does not include "info", Error when creating a harmonize flow based on entity definition, Changes to $options not persisting if set in headers or triples in Harmonization flow, Input flow job (load-acme-tech) failing on 2.0.0-beta.1, consider using windows compatible line breaks, Replica forests not created from quick-start, Error in documentation for the REST 
transform, Harmonization hits maximum document size in collector output, Create a checklist for making DHF releases, Expose the ability to set the writer plugin's target database in gradle, Generated code template from Entity for nested item hides vars (v2/Entities), Control what runs in update mode to minimize locking, Uninstall doesn't always finish on the UI, Consider name and description for Data Hub and tooling, Entity properties starting with a capital generate templates with a preceding dash in var names, Expose the ability to pass custom properties via gradle, Entity view: show "Loading entities" rather than "You don't have any entities yet", Better default document format for input flow, ugly scrollbars appear on project list in quickstart, Default the harmonize collector to only get items in a standard input collection, Add ability to specify source/target database for a Harmonize flow, MLCP options: Output URI Replace is not working as expected, Illegal/unsupported escape sequence in Windows 10 when creating entities, Better feedback for client-side validation failures, XQuery bug detected but not shown on QuickStart GUI, Save Options in Input Flow doesn't save changes to 'Output URI Replace', Default forests are created/attached even with custom forest JSON definitions, When one item fails in a harmonize batch run, other items in the chunk do not get processed. Because we store the original content in the attachments element of the document envelope, the flow can extract the content from the original source, add the new property, then overwrite the existing document. However, you can keep your DHF 4 flows as-is and continue to run them. Some other process sends data to MarkLogic, using MLCP, the REST API, or one of the libraries built on top of the REST API, and an input flow transforms the data along the way. Support index configuration as a part of pushbutton deploy. Harmonize flows are a process in themselves.
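The "extract from the original, add the new property, overwrite" step can be sketched as follows. This is plain JavaScript under stated assumptions, not DHF code: `addPropertyFromSource`, the `extract` callback, and the sample field names (`PRICE_USD`, `price`) are hypothetical, and the envelope is assumed to keep the untouched source under `attachments` and the harmonized entity under `instance`.

```javascript
// Hypothetical sketch; plain JavaScript with no MarkLogic or DHF calls.
// Assumes envelope.attachments holds the original source content and
// envelope.instance holds the harmonized entity.

// Returns a new document whose instance carries one extra property,
// computed from the original source by a per-source extractor function.
function addPropertyFromSource(doc, propertyName, extract) {
  const original = doc.envelope.attachments;
  return {
    envelope: {
      ...doc.envelope, // headers, triples, and attachments are kept as-is
      instance: { ...doc.envelope.instance, [propertyName]: extract(original) }
    }
  };
}

// A previously harmonized document that dropped the price field.
const harmonized = {
  envelope: {
    headers: {},
    triples: [],
    instance: { id: 42, name: "Widget" },
    attachments: { ID: 42, NAME: "Widget", PRICE_USD: "19.99" }
  }
};

// Each input source would supply its own extractor, since the value may
// live in a different place in each source's format.
const updated = addPropertyFromSource(
  harmonized,
  "price",
  (src) => Number(src.PRICE_USD)
);
console.log(JSON.stringify(updated.envelope.instance));
```

Overwriting the stored document with `updated` would complete the step; only the extractor changes from one input source to the next, which is why little coding is needed per source.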
As an example, when the Documentation team publishes new content, the guides are part of a large zip file. My colleague at MarkLogic, Paxton Hare, started the MarkLogic Data Hub Framework project early in 2016. Fixed #62 - added spring batch to run jobs, added support for running MLCP using a dependency to MLCP jar instead. When first built, the common pattern of use was that input flows were used to bring data into the staging content database, then harmonize flows were used to turn raw data in the staging database into commonly-structured envelope documents in the final database. Added Price, but didn't see it in the Product entity, A trace is created with an invalid format, Hub (un)install time on windows is horrible, Add --disable-host-check to allow external access, Some broken links on docs-3.0 DHF Tutorial, Data Hub website links still refer to old marklogic-community address, One of the links to the Data Hub website on readme.md is broken, Clean up inconsistencies in content/instance in documentation, Error when trying to run mlDeploy from online store example on development branch, Code edited externally not updating on quickstart editor windows, problem with instance-json-from document when extracting array of string, Final content.sjs is out of date in tutorial, Add documentation for gradle task to uninstall data-hub, Clarify docs: REST resources can be added without being connected to an entity, Update Hub ES code to get inline with newer ES features, Invoking harmonize flow via post-commit trigger fails with non-admin user, Getting Started tutorial shows a stack trace for step 8, sub-step 6, Update Java Client API dependency to 4.0.4, Fix link to QuickStart .war file in tutorial, DHFPROD-646 remove link in setup, no content, DHFPROD-663 improve sample-data setup info, DHFPROD-493, DHFPROD-646 3.x documentation updates, DHFPROD-502 fix tutorial, primary key does not add element range index, First pass on 2.x flow upgrade to 3, and removing unneeded modules, Updating Issue #578: Update the deletion message, DHFPROD-675 add index confirm for save new entity, DHFPROD-496 crop terminal screenshot in tutorial, DHFPROD-496 make consistent with current tutorial, E2E/bug fixes -- tests for some bug fixes, DHFPROD-646 3.x documentation, Understanding, DHFPROD-664 adjust offset, size of new entities in UI, Rework of fix for issue #557 where URI in request to /doc API.