Spotlight Archives - DBpedia Association

Wrap Up: DBpedia Tech Tutorial @ Knowledge Graph Conference 2021

Tue, 11 May 2021 08:45:28 +0000

On Tuesday the 4th of May, DBpedia organized a tutorial at the Knowledge Graph Conference (KGC) 2021. The goal of the tutorial was to teach participants the relevant technology around DBpedia: the knowledge graph, the infrastructure and possible use cases. The tutorial was aimed at existing and potential new users of DBpedia, developers who wish to learn how to replicate the DBpedia infrastructure, service providers, data providers as well as data scientists.

Below, we give a brief retrospective of the presentations. For further details, follow the link to the slides.

Opening

The tutorial, which was held online, was opened by Milan Dojchinovski (InfAI / DBpedia Association / CTU in Prague) with some general information about the program of the tutorial, its scope and technical information.

DBpedia in a Nutshell session

After the short opening, Milan continued with the first topic, the background on the DBpedia Association: how it all started and how DBpedia has evolved. Linked Data and the LOD cloud were also addressed, as well as the mappings, extractors and data groups (e.g. mappings, generic, text, wikidata). Then the DBpedia Ontology was presented and explained. Milan concluded the first topic with information on the DBpedia SPARQL endpoint and the DBpedia Databus platform.

Getting Started with DBpedia session

The next point on the program was split into two subtopics. First, Jan Forberg (InfAI / DBpedia Association) explained where to find data, including the DBpedia SPARQL endpoint, the DBpedia Databus platform as a repository for DBpedia and related datasets, and the novel “collections” concept. Moreover, DBpedia services such as DBpedia Lookup and DBpedia Spotlight were presented.

Afterwards, Jan explained how to use the data hosted on the Databus. Starting with the selection of particular artifacts, he explained the Docker container used to download data and a simple bash script that submits a SPARQL query and retrieves specific data artifacts.
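To illustrate the idea of retrieving artifacts via SPARQL, here is a minimal Python sketch (not the bash script shown in the tutorial). It assumes that Databus metadata uses the DataID and DCAT vocabularies with the properties dataid:artifact, dcat:distribution and dcat:downloadURL. The artifact URI and the endpoint URL are example values that may differ from the current Databus deployment, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# Prefixes of the vocabularies assumed to describe Databus metadata.
PREFIXES = (
    "PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>\n"
    "PREFIX dcat: <http://www.w3.org/ns/dcat#>\n"
)


def build_download_query(artifact_uri):
    """Return a SPARQL query listing the download URLs of one artifact."""
    return PREFIXES + (
        "SELECT DISTINCT ?file WHERE {\n"
        f"  ?dataset dataid:artifact <{artifact_uri}> .\n"
        "  ?dataset dcat:distribution ?distribution .\n"
        "  ?distribution dcat:downloadURL ?file .\n"
        "}"
    )


def build_request_url(endpoint, query):
    """Encode the query as a GET request URL (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urlencode({"query": query})


# Example values only (assumptions); check the Databus website for current ones.
ARTIFACT = "https://databus.dbpedia.org/dbpedia/mappings/mappingbased-objects"
ENDPOINT = "https://databus.dbpedia.org/sparql"

if __name__ == "__main__":
    # Print the URL that a client such as curl could fetch.
    print(build_request_url(ENDPOINT, build_download_query(ARTIFACT)))
```

The resulting URL can be fetched with any HTTP client. Each ?file binding in the result would then point to a file that can be downloaded directly.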

Building National Knowledge Graphs using DBpedia Tech

In the following session, Johannes Frey (InfAI / DBpedia Association) explained how to build national knowledge graphs using DBpedia Tech. The use case of the Dutch National Knowledge Graph, which was presented during the DBpedia Hackathon 2020, served as an example. For further information feel free to have a look at the presentations of the Hackathon 2020 here https://tinyurl.com/kgia-2020-dnkg. Please also see all relevant data here https://databus.dbpedia.org/dnkg/fusion/dutch-national-kg/.

DBpedia Technology Stack

Turning to the DBpedia Technology Stack, Jan started with the DBpedia Databus platform. He explained how the Databus platform works, its benefits (DataIDs and Simple Retrieval), the Databus SPARQL endpoints, the Web API and the Maven Plugin. After that, Jan presented the Dockerized Services including DBpedia Virtuoso and DBpedia Plugin, DBpedia Spotlight (incl. use cases) and DBpedia Lookup.

Marvin Hofer (InfAI / DBpedia Association) then explained the DBpedia release process on the Databus and presented his work on debugging DBpedia and the DBpedia Mods technology. Marvin also explained the quality assurance process using the concept of minidumps.

Afterwards, Johannes explained (Pre)fusion, ID management and the novel concept of cartridges.

Subsequently, Denis Streitmatter (InfAI / DBpedia Association) presented the DBpedia Archivo ontology manager and how to include your ontology in it. He also explained various use cases, e.g. how to find an ontology, how to test your ontology and how to back it up. Then he presented the 5-star schema for ontology tests and the SHACL-based tests for ontologies. Please read the official DBpedia Archivo call here https://www.dbpedia.org/blog/dbpedia-archivo-call-to-improve-the-web-of-ontologies/.

Contributions to DBpedia

Towards the end of the tutorial, Milan explained how to improve existing mappings or introduce new ones. He talked about improving the DBpedia Information Extraction Framework as well as contributing DBpedia tests. He then explained how to contribute mappings and links for knowledge cartridges and how to write Mods for the Databus.

In case you missed the event, our presentation is also available on the DBpedia event page. Further insights, feedback and photos about the event are available on Twitter (#DBpediaTutorial hashtag).

We are now looking forward to the next DBpedia tutorial, which will be held on September 1, 2021 co-located with the LDK conference in Zaragoza, Spain. Check more details here and register now! Furthermore, we will organize the DBpedia Day on September 9, 2021 at the Semantics Conference in Amsterdam. We are looking forward to meeting all Dutch DBpedians there!

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Yours DBpedia Association

The post Wrap Up: DBpedia Tech Tutorial @ Knowledge Graph Conference 2021 appeared first on DBpedia Association.


FinScience: leveraging DBpedia tools for fintech applications

Wed, 16 Dec 2020 13:55:05 +0000

DBpedia Member Features – In the last few weeks, we gave DBpedia members the chance to present special products, tools and applications and share them with the community. We already published several posts in which DBpedia members provided unique insights. This week we will continue with FinScience. They will present their latest products, solutions and challenges. Have fun while reading!

by FinScience

A brief presentation of who we are

FinScience is an Italian data-driven fintech company founded in 2017 in Milan by former Google senior managers and Alternative Data experts, who have combined their digital and financial expertise. FinScience thus originates from this merger of the world of Finance and the world of Data Science.
The company leverages its founders’ experience with Data Governance, Data Modeling and Data Platforms solutions. This experience is further enriched through the company’s tech role in the European Consortium SSIX (Horizon 2020 program), which focused on building a Social Sentiment for financial purposes. FinScience applies proprietary AI-based technologies to combine financial data/insights with alternative data in order to generate new investment ideas, ESG scores and non-conventional lists of companies that can be included in investment products by financial operators.

FinScience’s data analysis pipeline is strongly grounded in the DBpedia ontology: in our experience, the greatest value lies in the possibility to connect knowledge in different languages, to query automatically extracted structured information and to have rather frequently updated models.

Products and solutions

FinScience retrieves content from the web daily. About 1.5 million web pages are visited every day on about 35,000 different domains. The content of these pages is extracted, interpreted and analysed via Natural Language Processing techniques to identify valuable information and sources. Thanks to the structured information based on the DBpedia ontology, we can apply our proprietary AI algorithms to suggest the right investment opportunities to our customers. Our products are mainly based on the integration of this purely digital data (we call it “alternative data”) with traditional sources coming from the world of finance and sustainability. We describe these products briefly:

FinScience Platform for traders: it leverages the power of machine learning to help traders monitor specific companies, spot new trends in the financial market and access a high added-value selection of companies and themes.
ESG scoring: we provide an assessment of corporate ESG performance by combining internal data (traditional, self-disclosed data) with external ‘alternative’ data (stakeholder-generated data) in order to measure the gap between what companies communicate and how stakeholders perceive their corporate sustainability commitments.
Thematic selections of listed companies: we create Trend-Driven selections oriented towards innovative themes. Our data, together with the analysis of financial specialists, contribute to the selection of a set of listed companies related to trending themes such as the Green New Deal, 5G technology or new medtech applications.

FinScience and DBpedia

As mentioned before, FinScience is strongly grounded in the DBpedia ontology, since we employ Spotlight to perform Named Entity Recognition (NER), namely the automatic annotation of entities in a text. The NER task is performed with a two-step procedure. The first step consists of annotating the named entities of a text using DBpedia Spotlight. In particular, Spotlight links a mention in the text (identified by its name and its context within the text) to the DBpedia entity that maximizes the joint probability of occurrence of both. The model is pre-trained on texts extracted from Wikipedia. Note that each entity is represented by a link to a DBpedia page (see, e.g., http://dbpedia.org/page/Eni), a DBpedia type indicating the type of the entity according to this ontology, and other information.
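The annotation step can be sketched as follows. This is a minimal illustration, not FinScience's pipeline: it assumes the public Spotlight demo endpoint and its JSON response layout (a "Resources" list whose items carry "@URI", "@surfaceForm" and a comma-separated "@types" string). The function names are hypothetical and the sample response is made up for demonstration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed public demo endpoint; a self-hosted Spotlight instance may use another URL.
SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"


def build_request(text, confidence=0.5):
    """Prepare an annotation request that asks for a JSON response."""
    params = urlencode({"text": text, "confidence": confidence})
    return Request(SPOTLIGHT_URL + "?" + params,
                   headers={"Accept": "application/json"})


def extract_entities(response):
    """Turn a decoded Spotlight response into (mention, URI, types) tuples."""
    entities = []
    # The "Resources" key is absent when nothing was annotated.
    for resource in response.get("Resources", []):
        types = [t for t in resource.get("@types", "").split(",") if t]
        entities.append((resource["@surfaceForm"], resource["@URI"], types))
    return entities


def annotate(text, confidence=0.5):
    """Send the text to Spotlight and return the linked entities (needs network)."""
    with urlopen(build_request(text, confidence), timeout=30) as reply:
        return extract_entities(json.load(reply))


# Illustrative response only, shaped like the assumed layout above.
SAMPLE_RESPONSE = {
    "Resources": [
        {"@URI": "http://dbpedia.org/resource/Eni",
         "@surfaceForm": "Eni",
         "@types": "DBpedia:Company,Schema:Organization"}
    ]
}

if __name__ == "__main__":
    for mention, uri, types in extract_entities(SAMPLE_RESPONSE):
        print(mention, uri, types)
```

Keeping the parsing separate from the HTTP call makes it possible to test the extraction logic offline and to swap in a locally hosted Spotlight model.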

Another interesting feature of this approach is that we have a one-to-one mapping of the Italian and English entities (and, in general, of entities in any language supported by DBpedia), allowing us to have a unified representation of an entity in the two languages. We are able to obtain this kind of information by exploiting the potential of DBpedia Virtuoso, which allows us to access the DBpedia dataset via SPARQL. By identifying the entities mentioned in the online content, we can understand which topics are mentioned and thus identify companies and trends that are spreading in the digital ecosystem, as well as analyze how they are related to each other.
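One way such a cross-language lookup could be expressed in SPARQL is sketched below. It assumes that the queried endpoint exposes interlanguage links as owl:sameAs triples pointing to language-specific resources such as http://it.dbpedia.org/resource/…; the helper names are hypothetical, and the sample result is made up in the standard SPARQL JSON results layout.

```python
def build_same_as_query(resource_uri, lang):
    """Return a SPARQL query for the counterpart of an entity in another language."""
    target_prefix = f"http://{lang}.dbpedia.org/resource/"
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "SELECT ?localized WHERE {\n"
        f"  <{resource_uri}> owl:sameAs ?localized .\n"
        f'  FILTER(STRSTARTS(STR(?localized), "{target_prefix}"))\n'
        "}"
    )


def parse_bindings(results):
    """Extract the ?localized URIs from a SPARQL JSON results document."""
    return [row["localized"]["value"] for row in results["results"]["bindings"]]


# Illustrative result only, in the SPARQL 1.1 Query Results JSON format.
SAMPLE_RESULTS = {
    "results": {
        "bindings": [
            {"localized": {"type": "uri",
                           "value": "http://it.dbpedia.org/resource/Eni"}}
        ]
    }
}

if __name__ == "__main__":
    print(build_same_as_query("http://dbpedia.org/resource/Eni", "it"))
    print(parse_bindings(SAMPLE_RESULTS))
```

The query text can be sent to a DBpedia Virtuoso endpoint with an Accept header of application/sparql-results+json, and the response can be passed to parse_bindings after JSON decoding.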

Challenges and next steps

One of the toughest challenges for us is to find an optimal way to update the models used by DBpedia Spotlight. Every day new entities and concepts arise, and we want to recognise them in the news we analyze. And that is not all. In addition to recognizing new concepts, we need to be able to track an entity through all the updated versions of the model. In this way, we will not only be able to identify entities, but we will also have evidence of when some concepts first emerged. And we will know how they have changed over time, regardless of the names that have been used to identify them.

We are strongly involved in the DBpedia community and we try to contribute our know-how. In particular, FinScience will contribute to infrastructure and Dockerfiles as well as to finding issues in newly released projects (for instance, wikistats-extractor).

A big thank you to FinScience for presenting their products, challenges and contribution to DBpedia.

Yours,

DBpedia Association

The post FinScience: leveraging DBpedia tools for fintech applications appeared first on DBpedia Association.
