@fxprunayre fxprunayre commented Nov 17, 2020

Index field types

The index stores a JSON document representing each metadata record.

Simple fields are stored as key/value pairs, e.g.:

"_source": {
  "docType": "metadata",
  "documentStandard": "iso19115-3.2018",
  "document": "",
  "metadataIdentifier": "7451e2bd-22e8-4a74-a999-01c58b630369",
  "standardName": "ISO 19115",
  "indexingDate": "2020-11-19T06:30:47.009Z",
  "dateStamp": "2020-11-17T13:56:43",
  "mainLanguage": "fre"
}

Multilingual text field

Multilingual fields are stored using an object with the following properties:

  • default: the value in the record's default language (this property is updated on the client side based on the UI language)
  • lang{langCode}: one property per record language, holding the value in that language

"_source": {
  "resourceTitleObject": {
    "default": "Aménagements routiers et autoroutiers - Série",
    "langfre": "Aménagements routiers et autoroutiers - Série",
    "langger": "Straßen- und Autobahnverbesserungen - Standard"
  }
}
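
The lookup a client might perform on such an object can be sketched as follows. This is a minimal Python illustration, not GeoNetwork code: the helper name `pick_translation` is hypothetical, and it assumes the UI language is given as the same three-letter code used in the `lang{langCode}` properties.

```python
def pick_translation(field, ui_lang):
    """Return the value for ui_lang, or the record default if it is missing."""
    # Translations are stored under "lang" + language code, e.g. "langfre".
    return field.get("lang" + ui_lang, field.get("default"))


title = {
    "default": "Aménagements routiers et autoroutiers - Série",
    "langfre": "Aménagements routiers et autoroutiers - Série",
    "langger": "Straßen- und Autobahnverbesserungen - Standard",
}

print(pick_translation(title, "ger"))  # the German value
print(pick_translation(title, "eng"))  # no English value, so the default is used
```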

Codelist

Codelists are stored as objects, in the same way as titles, abstracts, etc.

If the record is multilingual, codelist translations are stored in the index for the record languages:

"cl_spatialRepresentationType": {
  "key": "grid",
  "default": "Grid",
  "langeng": "Grid",
  "langfre": "Grid",
  "text": "{{> inner text of the codelist element. Used in some profiles eg. ISO HNAP}}",
  "link": "./resources/codeList.xml#MD_SpatialRepresentationTypeCode"
}

When creating a facet on a codelist, there are two options:

  1. If the catalog content is in a single language (and codelists do not need to be translated into other languages), use the default property, e.g. cl_spatialRepresentationType.default.

  2. If the catalog contains a mix of languages and not all records are translated into all languages, use the key, e.g. cl_spatialRepresentationType.key, and translate on the client side. See GnSearchModule.js for loading extra codelist translations in the Angular app.

The second option is also required if you want codelists translated into the user interface language, whatever the record language. The application loads the codelist translations according to the UI language.

Use the default property in the record view. Depending on the UI language, the default property contains the translation in the UI language or falls back to the record default.
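
To illustrate the second option, here is a rough Python sketch of client-side translation of the buckets returned by a terms aggregation on the .key property. It is not the Angular implementation: the `CODELIST_LABELS` table and its labels are hypothetical stand-ins for the translations the application loads, and `label_buckets` is an invented helper name.

```python
# Hypothetical translations, standing in for what the client app loads
# for the current UI language.
CODELIST_LABELS = {
    "fre": {"grid": "Grille", "vector": "Vecteur"},
    "eng": {"grid": "Grid", "vector": "Vector"},
}


def label_buckets(buckets, ui_lang):
    """Attach a display label to each bucket of a terms aggregation on .key."""
    labels = CODELIST_LABELS.get(ui_lang, {})
    return [
        # Fall back to the raw key when no translation is available.
        {**bucket, "label": labels.get(bucket["key"], bucket["key"])}
        for bucket in buckets
    ]


buckets = [{"key": "grid", "doc_count": 12}, {"key": "tin", "doc_count": 1}]
labelled = label_buckets(buckets, "fre")
```

Because the buckets are keyed on the language-neutral codelist value, records in different languages fall into the same bucket, and only the label changes with the UI language.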

Thesaurus

Each thesaurus is described by the following fields:

  • th_{thesaurusId}Number: the count of non-empty keywords
  • th_{thesaurusId}: an array of multilingual keywords, each of which may contain a link (when using Anchor)
  • (optional) th_{thesaurusId}_tree: the hierarchy, present when broader terms are found. The default property contains the hierarchy in the record's default language; the key property contains the hierarchy of broader-term keys. The key property can be used to build a tree depending on the UI language (thesaurus translations have to be loaded by the client app).
{
	"th_httpinspireeceuropaeumetadatacodelistPriorityDatasetPriorityDatasetNumber": "3",
	"th_httpinspireeceuropaeumetadatacodelistPriorityDatasetPriorityDataset": [{
			"default": "Agglomerations - industrial noise exposure delineation (Noise Directive)",
			"langfre": "Agglomerations - industrial noise exposure delineation (Noise Directive)",
			"link": "http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/Agglomerations-IndustrialNoiseExposureDelineation-dir-2002-49"
		},
		{
			"default": "Agglomerations - noise exposure delineation day-evening-night (Noise Directive)",
			"langfre": "Agglomerations - noise exposure delineation day-evening-night (Noise Directive)",
			"link": "http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/Agglomerations-NoiseExposureDelineationDEN-dir-2002-49"
		},
		{
			"default": "Designated waters (Water Framework Directive)",
			"langfre": "Designated waters (Water Framework Directive)",
			"link": "http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/DesignatedWaters-dir-2000-60"
		}
	],
	"th_httpinspireeceuropaeumetadatacodelistPriorityDatasetPriorityDataset_tree": {
		"default": [
			"Directive 2000/60/EC",
			"Directive 2000/60/EC^Protected areas (Water Framework Directive)",
			"Directive 2000/60/EC^Protected areas (Water Framework Directive)^Designated waters (Water Framework Directive)",
			"Directive 2002/49/EC",
			"Directive 2002/49/EC^Environmental noise exposure (Noise Directive)",
			"Directive 2002/49/EC^Environmental noise exposure (Noise Directive)^Agglomerations - industrial noise exposure delineation (Noise Directive)",
			"Directive 2002/49/EC^Environmental noise exposure (Noise Directive)^Agglomerations - noise exposure delineation (Noise Directive)",
			"Directive 2002/49/EC^Environmental noise exposure (Noise Directive)^Agglomerations - noise exposure delineation (Noise Directive)^Agglomerations - noise exposure delineation day-evening-night (Noise Directive)"
		],
		"key": [
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2000-60",
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2000-60^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/ProtectedAreas-dir-2000-60",
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2000-60^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/ProtectedAreas-dir-2000-60^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/DesignatedWaters-dir-2000-60",
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2002-49",
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/EnvironmentalNoiseExposure-dir-2002-49",
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/EnvironmentalNoiseExposure-dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/Agglomerations-IndustrialNoiseExposureDelineation-dir-2002-49",
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/EnvironmentalNoiseExposure-dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/Agglomerations-NoiseExposureDelineation-dir-2002-49",
			"http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/EnvironmentalNoiseExposure-dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/Agglomerations-NoiseExposureDelineation-dir-2002-49^http://inspire.ec.europa.eu/metadata-codelist/PriorityDataset/Agglomerations-NoiseExposureDelineationDEN-dir-2002-49"
		]
	}
}
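
The `_tree` values are flat paths whose levels are separated by `^`. A minimal Python sketch of turning such paths into a nested structure is shown below; the helper name `build_tree` is hypothetical and this is only one way a client could rebuild the hierarchy.

```python
def build_tree(paths, separator="^"):
    """Turn separator-delimited paths into nested dicts keyed by level."""
    root = {}
    for path in paths:
        node = root
        for level in path.split(separator):
            # Create the level if it does not exist yet, then descend into it.
            node = node.setdefault(level, {})
    return root


paths = [
    "Directive 2000/60/EC",
    "Directive 2000/60/EC^Protected areas (Water Framework Directive)",
    "Directive 2000/60/EC^Protected areas (Water Framework Directive)"
    "^Designated waters (Water Framework Directive)",
]
tree = build_tree(paths)
```

The same function applies to the key property, where each level is a concept URI instead of a label.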

Other types

The index document also contains other types of objects, for fields such as:

  • geom, representing the bounding boxes of the record, stored as GeoJSON
  • contact, stored both as simple fields and as an object:
  "Org": "Direction Asset Management (SPW - Mobilité et Infrastructure)",
  "pointOfContactOrg": "Direction Asset Management (SPW - Mobilité et Infrastructure)",
  "contact": [
    {
      "organisation": "Direction Asset Management (SPW - Mobilité et Infrastructure)",
      "role": "pointOfContact",
      "email": "frederic.plumier@spw.wallonie.be",
      "website": "",
      "logo": "",
      "individual": "",
      "position": "",
      "phone": "",
      "address": "Boulevard du Nord, 8, NAMUR, 5000, Belgique"
    }
  ]
  • link
  "link": [
    {
      "protocol": "WWW:LINK-1.0-http--link",
      "url": "http://geoapps.spw.wallonie.be/portailRoutes",
      "name": "Portail cartographique des routes - Application sécurisée",
      "description": "Application de consultation des routes et autoroutes de Wallonie. Cette application est sécurisée et n'est accessible que pour les agents de la DGO1 du SPW.",
      "applicationProfile": "",
      "group": 0
    },
  • recordLink
  "recordLink": [
    {
      "type": "siblings",
      "associationType": "isComposedOf",
      "initiativeType": "collection",
      "to": "f010eda4-e791-44b1-8b2a-309f352f7d8f",
      "url": "",
      "title": "",
      "origin": "catalog"
    },

Aggregations configuration

Some examples:

Simple aggregation on a field:

  "cl_hierarchyLevel.key": {
    "terms": {
      "field": "cl_hierarchyLevel.key"
    }
  }

For a codelist, use .default when the catalogue is not multilingual and the UI is in a single language.

   "cl_spatialRepresentationType.default": {
     "terms": {
       "field": "cl_spatialRepresentationType.default",
       "size": 10
     }
   },

For a multilingual catalogue, use .key for the codelist. The codelist translations need to be loaded in the client app. See GnSearchModule.js.

   "cl_spatialRepresentationType.key": {
     "terms": {
       "field": "cl_spatialRepresentationType.key",
       "size": 10
     }
   },
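
The choice between the two codelist configurations above can be sketched as a small helper. This is an illustrative Python function, not part of GeoNetwork; the name `codelist_aggregation` and its parameters are assumptions.

```python
def codelist_aggregation(codelist, multilingual, size=10):
    """Build a terms aggregation on .key (multilingual) or .default."""
    prop = "key" if multilingual else "default"
    field = f"{codelist}.{prop}"
    return {field: {"terms": {"field": field, "size": size}}}


config = {}
config.update(codelist_aggregation("cl_spatialRepresentationType", multilingual=True))
config.update(codelist_aggregation("cl_hierarchyLevel", multilingual=False))
```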

GEMET configuration for a non-multilingual catalog. The default property contains the value in the record's default language, so if all records are in the same language there is no mix of languages. The user interface displays this language.

  "th_gemet_tree.default": {
    "terms": {
      "field": "th_gemet_tree.default",
      "size": 100,
      "order" : { "_key" : "asc" },
      "include": "[^\^]+^?[^\^]+"
      // Limit to 2 levels
    }
  },

If records are multilingual, languages are mixed:
image

GEMET configuration for a multilingual catalog. The key is translated on the client side by loading the required concepts using the thesaurus API.

  "th_gemet_tree.key": {
    "terms": {
      "field": "th_gemet_tree.key",
      "size": 100,
      "order" : { "_key" : "asc" },
      "include": "[^\^]+^?[^\^]+"
      // Limit to 2 levels
    }
  }

With the key, French and English translations are considered equivalent:

image
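
The intent of the include pattern in the GEMET configurations ("Limit to 2 levels") can be illustrated locally. The Python sketch below counts `^` separators instead of reproducing the pattern, because Elasticsearch and Python regular expression syntaxes differ. The helper name `limit_depth` is hypothetical, and the sample values are taken from the priority dataset example earlier in this description, not from GEMET.

```python
def limit_depth(values, max_levels=2, separator="^"):
    """Keep only tree values that have at most max_levels levels."""
    # A value with n levels contains n - 1 separators.
    return [value for value in values if value.count(separator) < max_levels]


values = [
    "Directive 2002/49/EC",
    "Directive 2002/49/EC^Environmental noise exposure (Noise Directive)",
    "Directive 2002/49/EC^Environmental noise exposure (Noise Directive)"
    "^Agglomerations - industrial noise exposure delineation (Noise Directive)",
]
kept = limit_depth(values)
```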

Aggregation based on queries. One query will define one bucket in the aggregation.

  "availableInServices": {
    "filters": {
      //"other_bucket_key": "others",
      // but that bucket does not support being clicked
      "filters": {
        "availableInViewService": {
          "query_string": {
            "query": "+linkProtocol:/OGC:WMS.*/"
          }
        },
        "availableInDownloadService": {
          "query_string": {
            "query": "+linkProtocol:/OGC:WFS.*/"
          }
        }
      }
    }
  }
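
To make the "one query, one bucket" idea concrete, the following Python sketch emulates the two filters locally. It does not call Elasticsearch: the record shape (a `linkProtocol` list of strings) and the helper name `count_buckets` are assumptions, and the query strings are reduced to their regular expressions.

```python
import re

# The two queries from the configuration, reduced to their regular expressions.
SERVICE_FILTERS = {
    "availableInViewService": r"OGC:WMS.*",
    "availableInDownloadService": r"OGC:WFS.*",
}


def count_buckets(records):
    """Count, per named filter, the records with at least one matching protocol."""
    counts = {name: 0 for name in SERVICE_FILTERS}
    for record in records:
        for name, pattern in SERVICE_FILTERS.items():
            # The whole protocol value has to match the pattern.
            if any(re.fullmatch(pattern, p) for p in record["linkProtocol"]):
                counts[name] += 1
    return counts


records = [
    {"linkProtocol": ["OGC:WMS", "OGC:WFS-1.0.0-http-get-capabilities"]},
    {"linkProtocol": ["OGC:WMS-1.3.0-http-get-map"]},
    {"linkProtocol": ["WWW:LINK-1.0-http--link"]},
]
counts = count_buckets(records)
```

The buckets are not exclusive: the first record is counted in both, and the third in neither.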

Keys such as availableInViewService may have no entry in the translations. Use the translation API to add your custom translations to the database.

To enable filtering in a facet, add an include property:

  "tag.default": {
    "terms": {
      "field": "tag.default",
      "include": ".*",
      "size": 10
    }
  }

include and exclude properties can be used to filter values too.

Aggregations can be collapsed by default and visible to users depending on roles:

  "dateStamp": {
    "userHasRole": "isReviewerOrMore",
    "collapsed": true,
    "auto_date_histogram": {
      "field": "dateStamp",
      "buckets": 50
    }
  }

(Experimental) A tree field whose key is a URI,
e.g. http://www.ifremer.fr/thesaurus/sextant/theme#52,
but whose translation contains a hierarchy with a custom separator, e.g.
/Regulation and Management/Technical and Management Zonations/Sensitive Zones

   "th_sextant-theme_tree.key": {
     "terms": {
       "field": "th_sextant-theme_tree.key",
       "size": 100,
       "order" : { "_key" : "asc" }
     },
     "meta": {
       "translateOnLoad": true,
       "treeKeySeparator": "/"
     }
   }
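
How the client uses treeKeySeparator is not detailed here. One plausible reading is that the translated label is split on the separator to recover the hierarchy levels. The Python sketch below illustrates that reading only; the helper names are hypothetical.

```python
def label_to_levels(label, separator="/"):
    """Split a translated label into hierarchy levels, skipping empty parts."""
    return [part for part in label.split(separator) if part]


def label_to_paths(label, separator="/"):
    """Return the cumulative path of every level, from root to leaf."""
    levels = label_to_levels(label, separator)
    return [separator.join(levels[: i + 1]) for i in range(len(levels))]


label = "/Regulation and Management/Technical and Management Zonations/Sensitive Zones"
levels = label_to_levels(label)
paths = label_to_paths(label)
```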

Other improvements

  • Active filters translate each level of tree aggregations:

image

  • More robust multilingual indexing

API changes

  • API / vocabularies/keyword / Can return JSON response

Work supported by Ifremer / Sextant.

@fxprunayre fxprunayre added this to the 4.0.2 milestone Nov 17, 2020
* Keyword default language is now set to the first label with a non empty value.
* API / Keyword / Can return JSON response
* Agg tree loads translation when needed using promise
* Agg / Add key translator to ignore multilingual field suffix
@fxprunayre fxprunayre marked this pull request as ready for review November 19, 2020 07:37
@fxprunayre fxprunayre merged commit 951ea02 into geonetwork:4.0.x Nov 19, 2020
fxprunayre added a commit that referenced this pull request Nov 27, 2020
fxprunayre added a commit that referenced this pull request Nov 27, 2020