Some documents might not be inserted. For example: Failover does not succeed within 5 minutes. If it takes more than 5 minutes for MarkLogic to recover from the failure, then mlcp aborts the job and reports an error. A failure of some kind occurs, such as a host going down. The exact error messages depend on the type of failure. Notice that the example errors below include a retryable exception. WARN mapreduce.ContentWriter: Batch ERROR mapreduce. Errors may continue to occur because MarkLogic is still failing over.
INFO mapreduce. Transactions automatically commit and roll back. Between each retry, mlcp sleeps for a certain amount of time. The interval varies from 0. In most cases, a successful retry will not cause any insertions to fail.
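The retry-and-sleep behavior described above can be pictured with a small Python sketch. This is not mlcp's implementation: the names `RetryableError` and `insert_with_retry`, the retry limit, and the doubling delay schedule are all illustrative assumptions. The source only states that mlcp sleeps for some interval between retries.

```python
import time


class RetryableError(Exception):
    """Hypothetical stand-in for a retryable exception raised during a batch insert."""


def insert_with_retry(insert_batch, batch, max_retries=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call insert_batch(batch), sleeping between attempts when it fails.

    Every numeric default here is an assumption chosen for illustration.
    Returns the number of attempts used, or re-raises once max_retries
    retries have been exhausted.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            insert_batch(batch)
            return attempt
        except RetryableError:
            if attempt > max_retries:
                raise  # give up: this batch fails permanently
            # Assumed schedule: double the delay each time, up to a cap.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            sleep(delay)
```

Passing the `sleep` function as a parameter keeps the sketch easy to exercise without real waiting, for example by recording the requested delays in a list.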
All the documents in the current transaction will fail permanently. WARN contentpump.TransformWriter: Batch DEBUG mapreduce. DEBUG contentpump. ContentWriter: com. ERROR contentpump. Loading temporal documents may have issues. When an mlcp commit fails and an exception is caught, mlcp tries to roll back before retrying the load of the whole batch.
This may create issues for temporal documents, since they may be inserted multiple times. This is to prevent the client side from running out of memory, as the DHS cluster may have a huge number of nodes. ThreadManager: Thread pool will auto-scale based on available server threads.
ThreadManager: Running with MultithreadedMapper. Initial thread count for split 2: 10 INFO contentpump. ThreadManager: Thread pool is scaling-in. ThreadManager: New available server threads:

Import Command Line Options

This section summarizes the command line options available with the mlcp import command. The following command line options define your connection to MarkLogic:

Option Description

-host comma-list Required. A comma separated list of hosts through which mlcp can connect to the destination MarkLogic Server.
You must specify at least one host. For more details, see How mlcp Uses the Host List. Default: Required, unless using Kerberos authentication. Default: The first child element under the root element. Default: No namespace. When splitting an aggregate input file into multiple documents, the element or attribute name within the document root to use as the document URI. Default: In local mode, hashcode-seqnum, where the hashcode is derived from the split number; in distributed mode, taskid-seqnum.
Maximum: This option can be combined with other filter options. Default: Import all documents. The option value must be a character set name accepted by your JVM; see java. Default: UTF Set to system to use the platform default encoding for the host on which mlcp runs. Default: true. The option value must be a comma separated list of name,datatype pairs, such as a,number,b,boolean.
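As a rough illustration of the name,datatype pair format, the following Python sketch parses a value such as a,number,b,boolean into a lookup table. The function names are hypothetical and this is not how mlcp parses the option internally; the fallback to a string type mirrors the documented default that all fields have string type.

```python
def parse_data_type_pairs(value):
    """Parse 'a,number,b,boolean' into {'a': 'number', 'b': 'boolean'}."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) % 2 != 0:
        raise ValueError("expected an even number of comma separated items")
    # Even positions are field names, odd positions are their datatypes.
    return dict(zip(parts[0::2], parts[1::2]))


def field_type(types, name):
    """Return the declared type for a field, or 'string' when none was given."""
    return types.get(name, "string")
```

For example, `field_type(parse_data_type_pairs("a,number,b,boolean"), "c")` falls back to `"string"` because no type was declared for `c`.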
Default: All fields have string type. Default: The database associated with the destination App Server identified by -host and -port. Default: comma (,). Default: root. Accepted values: mixed (documents only), xml, json, text, binary. See Time vs. Useful when splitting an input file into multiple documents. If this is false and the archive contains no metadata, an error occurs.
Accepted values: zip, gzip. Default: Load all files. Default: documents. Default: The maximum Long value Long. This command line option is optional. Default: 0.
Accepted values: local. Accepted values: filesystem or a modules database name. Default: The modules database associated with the App Server. Default: The modules root configured for the App Server. If you also use -modules , then this path specifies the modules root for that modules database.
If you use an options file, this option must appear first. Loaded documents are added to these collections.
If the directory exists, its contents are removed prior to ingesting new documents. Using this option enables -fastload by default, which can cause duplicate URIs to be created. For quad data, specifies the default graph for quads that do not include an explicit graph label. For other triple formats, specifies the graph into which to load all triples.
For details, see Loading Triples. The graph into which to load all triples. For quads, overrides any graph label in the quads. Default: The default permissions associated with the user inserting the document.
Used to construct output document URIs. The replacement strings must be enclosed in single quotes. Default: 1. Default: false for local mode. Do not use this option with multi-byte character data. You must include this option if you use the -ssl option to connect to an App Server configured to disable MarkLogic's default protocol TLSv1. Allowed values: tls, tlsv1, tlsv1. Default: TLSv1. Can be passed along with the existing -ssl option. Default: transform. This option is required to enable a custom transformation.
Default: no namespace. Accepted values: default , full , none. In XQuery 0. Serialized RDF triples, in one of several formats. The value in the column used as the id.
For a record of the form first,second,third where Column 1 is the id: first. Add a. Replace the unmapped suffix with. For quads that contain an explicit graph IRI, load the triple into that graph. Graphs: aa for all triples All triples also added to collections bb and cc. Graph: aa All triples also added to collections bb and cc. Data about the original input document.
Additional context information about the insertion, such as transformation-specific parameter values. A sequence of sec:permission elements, each representing a capability and a role id. An integer value or a string that can be converted to an integer. An array of permission objects, each containing a capability and a roleId property. An object where each property represents a key-value metadata item. Port number of the destination MarkLogic Server. MarkLogic Server user with which to import documents.
Password for the MarkLogic Server user specified with -username. When splitting an aggregate input file into multiple documents, the name of the element to use as the output document root. The number of documents to process in a single request to MarkLogic Server. A comma-separated list of collection URIs. When importing documents from an archive, whether to copy document collections from the source archive to the destination.
When importing documents from an archive, whether to copy document key-value metadata from the source archive to the destination. When importing documents from an archive, whether to copy document permissions from the source archive to the destination. When importing documents from an archive, whether to copy document properties from the source archive to the destination.
When importing documents from an archive, whether to copy document quality from the source archive to the destination. The name of the destination database. A comma-separated list of database directory names. Whether or not to force optimal performance, even at the risk of creating duplicate document URIs.
Add each loaded document to a collection corresponding to the name of the input file. When importing documents from a database archive, whether or not to ignore missing metadata files. A regular expression describing the filesystem location(s) to use for input. The input file type. When importing from files, the maximum number of bytes in one input split. The maximum number of threads that run mlcp. The maximum percentage integer between 0 and of available server threads used by mlcp for import jobs.
When importing from files, the minimum number of bytes in one input split. Specify the name of the modules database to use when applying a server-side transformation. The modules root path to use when applying a server-side transformation. The default namespace for all XML documents created during loading.
Specify an options file pathname from which to read additional command line options. Whether or not to delete all content in the output database directory prior to loading. A comma separated list of collection URIs. The destination database directory in which to create the loaded documents. The name of the database partition in which to create documents. A comma separated list of role,capability pairs to apply to loaded documents. Specify a prefix to prepend to the default URI.
A comma separated list of regex,string pairs that define string replacements to apply to the URIs of documents added to the database. The initial delay, in minutes, before mlcp starts sending polling requests to check the available server threads. The time interval, in minutes, at which mlcp sends polling requests to check the current available server threads.
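To make the regex,string pair format for URI replacement more concrete, here is a minimal Python sketch. It is an approximation under stated assumptions: the function names are hypothetical, Python's `re` module stands in for whatever regular expression engine mlcp uses, patterns are assumed to contain no commas, and replacement strings are expected in single quotes, as the option description requires.

```python
import re


def parse_replace_pairs(value):
    """Split "regex,'string',regex,'string'" into (regex, replacement) tuples.

    Assumes the patterns themselves contain no commas.
    """
    parts = value.split(",")
    if len(parts) % 2 != 0:
        raise ValueError("expected regex,string pairs")
    pairs = []
    for pattern, replacement in zip(parts[0::2], parts[1::2]):
        replacement = replacement.strip()
        quoted = (len(replacement) >= 2
                  and replacement[0] == "'" and replacement[-1] == "'")
        if not quoted:
            raise ValueError("replacement strings must be enclosed in single quotes")
        pairs.append((pattern.strip(), replacement[1:-1]))
    return pairs


def apply_replacements(uri, pairs):
    """Apply each (regex, replacement) pair to the URI, in order."""
    for pattern, replacement in pairs:
        # A function replacement keeps the replacement text literal.
        uri = re.sub(pattern, lambda match, text=replacement: text, uri)
    return uri
```

With the pairs `/space/data,'',bill,'william'`, the URI `/space/data/docs/bill.xml` becomes `/docs/william.xml`: the first pair strips the path prefix and the second rewrites part of the file name.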
Restrict mlcp to connect to MarkLogic only through the hosts listed in the -host option. Whether or not to divide input data into logical chunks to support more concurrency.
In the above example, the fn:data call is wrapped in xdmp:describe to more accurately represent the in-memory type. If you omit the xdmp:describe wrapper, serialization of the value for display purposes can obscure the type.
Create, read, update and delete JSON documents using the same functions you use for other document types, including the following builtin functions:. A node to be inserted into an object node must have a name. A node to be inserted in an array node can be unnamed. Use xdmp:unquote to convert serialized JSON into a node for insertion into the database.
For example: You can also use the mlcp command line tool for loading JSON documents into the database. The table below provides examples of updating JSON documents using xdmp:node-replace, xdmp:node-insert, xdmp:node-insert-before, and xdmp:node-insert-after. Notice that when inserting one object into another, you must pass the named object node to the node operation.
For example, assuming fn:doc "my. This section covers the following search related topics:. For details, see the following references:. A name-value pair in a JSON document is called a property. Constructors for JSON index references are also available, such as cts:json-property-reference.
For details, see the following:. The proper form is chosen based on the parent node and the calling language. Otherwise, a query is serialized based on the calling language. If the value of a JSON query property is an array and the array is empty, the property is omitted from the serialized query.
If the value of a property is an array containing only one item, it is still serialized as an array. You must use the NodeBuilder interface to construct text, number, boolean, and null nodes.
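The two array rules for serialized JSON queries (an empty array causes the property to be omitted, while a one-item array stays an array) can be sketched in Python as follows. The function name and the sample property names are hypothetical; this only illustrates the rules and is not MarkLogic's serializer.

```python
def serialize_query_properties(props):
    """Apply the array rules to a dict of query properties."""
    out = {}
    for name, value in props.items():
        if isinstance(value, list) and len(value) == 0:
            continue  # empty array: the property is omitted entirely
        out[name] = value  # a one-item array is kept as an array, not unwrapped
    return out
```

For instance, `{"text": ["cat"], "options": [], "weight": 1}` serializes to `{"text": ["cat"], "weight": 1}`.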
To make changes to a JSON document whose root node is a JSON object node or array node, convert the immutable document node into its mutable JavaScript representation using the following technique. The following example applies the toObject technique to a document with an object node root. The example inserts, updates, and deletes JSON properties on a mutable object, and then updates the original document using xdmp. The example uses xdmp. You can use this technique even when the root node of the document is not an object node.
The following example applies the same toObject technique to update a document with an array node as its root. If you attempt to modify a JSON document node without converting it to its mutable JavaScript representation using toObject , you will get an error. For example, the following code would produce an error because it attempts to change the value of a property named a on the immutable document node:. That is, use the toObject method to first convert the document node into its logical native JavaScript representation so that you can manipulate it in a natural way.
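The convert-then-modify pattern can be illustrated outside MarkLogic with a Python analogy. Nothing below is MarkLogic API: a read-only `MappingProxyType` view stands in for the immutable document node, and the hypothetical `to_object` helper plays the role of the toObject conversion by returning an independent, mutable copy.

```python
import copy
from types import MappingProxyType

# A read-only view stands in for the immutable document node.
stored = {"a": 1, "b": [1, 2]}
doc_node = MappingProxyType(stored)


def to_object(node):
    """Return an independent, mutable deep copy of the read-only view."""
    return copy.deepcopy(dict(node))


def update_document(node):
    """Insert, update, and delete properties on a mutable copy."""
    obj = to_object(node)
    obj["a"] = 2       # update an existing property
    obj["c"] = "new"   # insert a new property
    del obj["b"]       # delete a property
    return obj         # a real application would now write this back
```

Assigning directly to `doc_node["a"]` raises a `TypeError`, which mirrors the error you get when modifying the immutable node, while the copy returned by `to_object` can be changed freely without touching the original.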
This technique applies even if the root node of the document is not an object node. For example, the following code retrieves the first item from a JSON document whose root node is an array node:. The following example uses a JSON document whose root node is a number node:. If you cannot read the entire document into memory for some reason, you can also access its contents through the document node root property.
You can only use the insert and replace functions in contexts in which you can construct a suitable node to insert or replace.
For example, inserting or updating array items, or updating the value of an existing JSON property. You cannot construct a node that represents just a JSON property, so you cannot use xdmp. To replace the value of an array node, you must address the array node, not one of the array items. For example, use a path expression with an array-node or node expression in its leaf step.
For more details, see Selecting Arrays and Array Members. Keep the following points in mind when passing new or replacement nodes into the update functions. The following examples illustrate using the node update functions on JSON documents.
This section describes how to perform these conversions and includes the following parts:. This section describes how to use the XQuery library and includes the following parts:. To understand how the JSON conversion features in MarkLogic work, it is useful to understand the following goals that MarkLogic considered when designing the conversion:. Because of these goals, the defaults are set up to make conversion both fast and easy. Custom conversion is possible, but will take a little more effort.
A strategy is a piece of configuration that tells the JSON conversion library how you want the conversion to behave. To use any strategy except the basic strategy, you can set and check the configuration options using the following functions:. For the custom strategy, you can tailor the conversion to your requirements.
The default mode is Auto. In Auto mode, your query results are formatted for readability based on the query and the output type. In Auto mode, you can override the default rendering using the format dropdown at the far right of the results pane. For example, strings are rendered as text by default, but if you know the string contains serialized JSON, you can change the rendering to JSON to get syntax highlighting and tree controls.
The choices on the format dropdown depend on the type of data returned by your query. Raw mode always displays plain text, but it is not necessarily the query results exactly as returned from MarkLogic Server. Slight formatting changes are still applied to improve readability. Each time you modify a query and evaluate it, Query Console saves the contents and time of execution in the Query History.
Query Console maintains a separate history for each query. Query Console adds a history entry for each unique version of a query. If the query text is unchanged between runs or if the changes create a duplicate of an existing history entry for the query, Query Console does not create a new entry. To remove a history entry, click the delete X button in the upper right corner of the entry.
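The history rules above (one entry per unique version of a query, no new entry for unchanged or duplicate text) can be modeled with a short Python sketch. The `QueryHistory` class is hypothetical and only mirrors the described behavior. Whether Query Console refreshes the timestamp of an existing entry on a duplicate run is not stated in the text, so this sketch leaves the existing entry untouched.

```python
from datetime import datetime


class QueryHistory:
    """Per-query history that stores each unique query text once."""

    def __init__(self):
        self.entries = []  # list of (query_text, time_of_execution)

    def record(self, text, when=None):
        """Add an entry unless the same text is already stored.

        Returns True when a new entry was created.
        """
        if any(existing == text for existing, _ in self.entries):
            return False
        self.entries.append((text, when or datetime.now()))
        return True

    def remove(self, index):
        """Delete one entry, like clicking the delete (X) button."""
        del self.entries[index]
```

Running the same text twice, or editing a query back to an earlier version, leaves the number of entries unchanged.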
To close the history dropdown, click on the Query History dropdown again, or simply move the mouse outside the dropdown. For XQuery, Query Console profiles your query as if you passed your query to prof:invoke, and then displays a performance report in the results pane.
Profiling must be enabled on an App Server before you can profile a query. It is enabled by default when you create an App Server. In XQuery, your Query Console query appears as the. In JavaScript, your query appears as program. When profiling a JavaScript query, you can click on the download icon in the upper right of the profiling report to save your profiling data in a format that can be imported into the Profiles tab of the Chrome browser developer tools.
For details on profiling queries and the meaning of the profile report columns, see Profiling Requests to Evaluate Performance in Query Performance and Tuning Guide. Two types of query plan are available: the estimated plan and the actual plan. This query plan tree is then rearranged by MarkLogic's query optimizer to produce a more efficient query execution strategy. Viewing a query plan shows a graphical representation of the operators used, arranged in a tree structure that shows the flow of information from the leaves (data accessors), via operators (joins, filters, etc.).
Each query plan operator displays its name and a table of pertinent information about the operator. Further information is available in tooltips by hovering over the title, costs, or table display areas of the operator information box. It is beyond the scope of this walkthrough to detail everything about the information shown by the query plan viewer.
Further information can be found by watching the following videos:. In MarkLogic The estimated plan can be used to understand how the query will be executed, and how the query optimizer has assigned costs and chosen that query plan.
The estimated plan is available before the query has been executed. It contains a graphical representation of information on the costs assigned to each query plan operator by the query optimizer.
This representation uses a row of red shaded shapes below the operator title - each shape is a different cost metric, and the deeper the shade of colour the higher the value of the metric is compared to the maximum value for that metric across all operators in the plan.
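The shading rule, where each shape's depth of colour reflects a metric's value relative to the maximum for that metric across all operators, amounts to a per-metric normalization. The Python sketch below assumes a simple linear mapping to the range 0 to 1; the function name, operator names, and the `cpu` metric are hypothetical, and the viewer's actual colour scale is not documented here.

```python
def shade_intensities(operators, metric):
    """Map each operator name to value / max(value) for one metric.

    `operators` maps an operator name to a dict of metric values.
    Returns intensities between 0.0 and 1.0 (all 0.0 if the maximum is 0).
    """
    peak = max(op[metric] for op in operators.values())
    if peak == 0:
        return {name: 0.0 for name in operators}
    return {name: op[metric] / peak for name, op in operators.items()}
```

With values of 10, 40, and 20 for three operators, the intensities come out as 0.25, 1.0, and 0.5, so the operator with the highest value gets the deepest shade.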
To view the estimated plan, switch to the tab in the results pane. The query plan display is not updated automatically, so the "Generate" button needs to be clicked to update the display if the query has changed since the plan was last viewed. The actual plan can be used to understand how a query executed, and where memory and time were spent during that query execution. The actual plan is only available after the query has been executed. It is only produced if the check box on the "Actual Plan" tab is activated, and the instrumentation used to create it reduces the performance of the query executed.
The actual plan contains a graphical representation of information for each operator that is measured during the query execution, including row count, execution time, and peak memory usage. This representation uses a row of red shaded shapes below the operator title - each shape is a different execution metric, and the deeper the shade of colour the higher the value of the metric is compared to the maximum value for that metric across all operators in the plan.
To view the actual plan, slide the check box on the "Actual Plan" tab to on, and run the query using the run button. After this, switching to the "Actual Plan" tab will display the query plan annotated with runtime information.
Use the Explorer feature to browse the contents of a database. You can also use the Explorer to modify database contents; for details, see Editing Database Content. For each document in the database, the summary includes the document URI, the type and name of the root node, a link to the document properties, and a link to any collections to which the document belongs.
Use the search box of the Explorer to explore only a specific document or the documents that match a wildcard expression. The URI lexicon must be enabled on the database you are exploring. You will get an error if the URI lexicon is not enabled. The following table contains some examples of filtering expressions:
You can use the Explorer to modify your database content without writing code. See the following topics for details:. You can only insert or update content. You cannot modify metadata such as permissions and properties. You can insert a new document into a database from the Explorer.