XML: example, features and capabilities, pros and cons of the format

XML, adopted as a recommendation by the World Wide Web Consortium (W3C), looks similar to HTML. It is a subset of SGML, the standard markup language used to describe document structure, but it is less complex and easier to use. The basic building block of XML is the element, delimited by a start tag and an end tag. All data in an XML document is contained in a single outermost element known as the root element. Element names describe their contents, and the structure describes the relationships between the elements. XML supports nested elements, which form a hierarchy.

History

XML emerged as a way to overcome the shortcomings of its two predecessors, SGML and HTML. In the late 1980s, before the Internet became widespread, digital media publishers recognized the benefits of SGML for dynamically displaying information. The language was an extremely powerful and extensible tool for semantic markup and was especially useful for cataloging and indexing data. SGML can still be used to create any number of markup languages.

However, SGML is quite complex and expensive, especially for everyday use on the Internet. Adding SGML support to a word processor could double or even triple its price. Finally, commercial browser vendors made it clear that they never intended to support SGML.

One of the most popular applications of SGML was HTML, the hypertext markup language created by Tim Berners-Lee around 1990. Since its inception, HTML has been a victim of its own popularity: it was quickly adopted and extended in many ways that went beyond its original vision.





[Figure: W3C Approved XML Version 1.0]

HTML remains popular today, although it is considered unsuitable as a general-purpose data storage format. XML bridges the gap: it is readable by both humans and computers, yet flexible enough to support data exchange independent of platform and architecture. In 1998, the W3C approved version 1.0 of XML, and the new language was officially established.

Element structure

There are two ways to define the structure of an XML document: a Document Type Definition (DTD) or an XML Schema. DTDs were inherited from SGML. Their syntax resembles Extended Backus-Naur Form (EBNF).

[Figure: XML Schema Documents]

XML Schema documents are themselves written in XML syntax. Both DTD and XML Schema let you define constraints that apply to the content of instance documents. These constraints take the form of rules for validating the XML structure.





Every XML document has exactly one root element, which contains child elements, which may contain children of their own, and so on. The result is a hierarchical tree structure.
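The tree structure can be made visible with a short sketch. This is a minimal illustration in Python using the standard library's xml.etree.ElementTree; the library document and the outline helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, used only to illustrate nesting.
DOC = """<library>
  <shelf>
    <book>XML Basics</book>
    <book>SGML History</book>
  </shelf>
</library>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:  # iterating an Element yields its direct children
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(DOC)
tree_rows = outline(root)
for depth, tag in tree_rows:
    print("  " * depth + tag)  # indentation mirrors the depth in the tree
```

The single root (library) sits at depth 0, and every other element hangs below it.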

Because of their SGML origins, document type definitions are better suited to document-oriented applications such as HTML, which was historically defined by a DTD. A DTD can define the structure of a document, but it cannot define rules for the data itself: all data in a document described by a DTD is treated as a string. This is adequate for markup languages, but not when the application needs to control the data the document contains.

An XML document is considered well-formed, meaning a parser can read and understand it, if its format conforms to the XML specification: it is correctly marked up and its elements are properly nested. XML also lets you define attributes for elements, which describe characteristics in the start tag. XML documents can be very simple, as in this "hello world" example:

<?xml version="1.0" encoding="UTF-8"?>
<text>
  <para>hello world</para>
</text>
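Whether a document is well-formed can be checked by simply handing it to a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper name, and the two broken inputs are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = "<text><para>hello world</para></text>"
bad_nesting = "<text><para>hello world</text></para>"  # tags overlap
missing_close = "<text><para>hello world</para>"       # <text> is never closed

print(is_well_formed(good), is_well_formed(bad_nesting), is_well_formed(missing_close))
```

Only the first input passes. The other two violate the nesting and closing rules described above.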



Firewall Security Guide

The security features of an XML firewall make it a valuable complement to any organization’s web service strategy. Compared to other systems, an XML firewall can perform deep validation of message content and has many other features that make it a strong candidate for protecting data and guarding against vulnerabilities and threats.

Vendors are constantly adding new features to stay ahead of attackers and block their malicious actions. Unfortunately, some firewalls on the market today still fall short when it comes to protecting messages and web services. XML firewalls are more capable than traditional ones in this respect. An example XML file demonstrates the operation of a firewall.

[Figure: Firewall Security Guide]

Traditional firewalls work well with regular traffic, but protecting XML data streams requires different technology. This makes the XML firewall an important element of securing web services.

Businesses developing web applications and web services based on XML are increasingly turning to the Security Assertion Markup Language (SAML) to convey credentials and authorization information, so they need to protect themselves from XML and SAML protocol attacks at the application level. An XML firewall can be an adequate tool for securing multi-tier systems.
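To give a rough idea of what content-level inspection can mean, here is a minimal sketch in Python using the standard xml.parsers.expat module. The policy it applies (reject DOCTYPE declarations, limit nesting depth) and the names inspect, Rejected, and MAX_DEPTH are assumptions chosen for illustration; they do not describe any particular firewall product.

```python
import xml.parsers.expat

MAX_DEPTH = 20  # illustrative limit, not taken from any product


class Rejected(Exception):
    """Raised when a message violates the illustrative inspection policy."""


def inspect(message: bytes) -> bool:
    """Return True if the message is well-formed and within the policy."""
    depth = 0

    def on_doctype(name, sysid, pubid, has_internal_subset):
        # Assumed policy: messages must not carry DTDs or entity definitions.
        raise Rejected("DOCTYPE declarations are not allowed")

    def on_start(name, attrs):
        nonlocal depth
        depth += 1
        if depth > MAX_DEPTH:
            raise Rejected("nesting too deep")

    def on_end(name):
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    try:
        parser.Parse(message, True)  # True marks the final chunk of input
    except (Rejected, xml.parsers.expat.ExpatError):
        return False
    return True
```

A real XML firewall would apply many more checks, such as schema validation and SAML assertion verification. The sketch only shows that the decision is made by looking inside the message rather than at its network headers.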

Management in SOA

Many SOA installations face performance problems because they lack proper data management. Despite all the hype and buzzwords that surround SOA, and despite efforts to integrate a service-oriented architecture into the IT infrastructure, developers still fail to account for data integration and data management problems in their projects.

An XML sitemap and the nuances of its use are one demonstration of the capabilities of the language.

[Figure: Data Management in SOA]

The bottom line is to recognize the value of the organization’s data, wherever it resides (under the SOA umbrella or outside it), and to find methods for collecting and transmitting information between producers and consumers with minimal complexity. An example of an XML SOA file for information security is presented below.

[Figure: Sample XML SOA file]

By creating metadata in XML and then building XSLT applications to move it to and from SOA components, developers gain numerous benefits:

  1. They create tools that capture key data elements, interactions, and semantics, and that make it easy to move data between SOA components. They also document the basic concepts and assumptions about the data they use and the metadata it requires.
  2. Clear, abstract representations of the information flows between components (as well as the nature and scale of these flows) make it possible to redirect them as new business needs arise and as new producers and consumers enter the picture.
  3. XML and messaging protocols such as SOAP make it easy to abstract data and move it. But they also raise the importance of where the data resides, how it acquires or keeps its proper context, and how syntax, semantics, and accuracy checks are associated with the real information it represents.

Parser process

One of the goals of XML was to improve on raw data formats, such as plain text, by including detailed descriptions of the meaning of the content. To read XML files, you use a parser. In essence, it exposes the contents of the document through an application programming interface (API). In other words, the client application accesses the contents of the XML document through an interface instead of interpreting the markup on its own. This can be demonstrated using a Java XML parser as an example.
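The idea of reaching content through a parser API can be sketched briefly. This illustration uses Python's standard xml.dom.minidom module as a stand-in for a Java parser; the document is the "hello world" sample, and the variable names are arbitrary.

```python
from xml.dom import minidom

doc = minidom.parseString("<text><para>hello world</para></text>")

# The application asks the parser's API for nodes instead of scanning the raw string.
root_name = doc.documentElement.tagName
para = doc.getElementsByTagName("para")[0]
content = para.firstChild.data  # the text node inside <para>

print(root_name, content)
```

The client code never looks at angle brackets. It only calls DOM methods such as getElementsByTagName.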

The XML Parser processor parses a well-formed XML document embedded in a string field and passes the parsed data to an output field in the record.

When configuring the XML Parser, you specify two fields: the first contains the document, and the second is the target for the parsed results. You can define a delimiter element to split the document into several values. If no delimiter element is defined, the XML Parser passes the entire document to the target field as a map.

To define the delimiter element, you can use either an element name or a simplified XPath expression. Use an element name when the element is directly below the root node, and a simplified XPath expression for access to data deeper in the XML document.

If the XML document yields more than one value, you can return the first value, return the values as a list, or generate a record for each value. When the processor generates records, it includes all other incoming fields in each generated record.
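The delimiter-element behavior described above can be approximated in a few lines. This is only a sketch in Python with xml.etree.ElementTree, not the processor's actual implementation; the record layout, the field names, and the split_records and element_to_map helpers are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: one string field holds XML, another field is unrelated.
record = {
    "id": 7,
    "payload": "<orders><order><sku>A1</sku></order><order><sku>B2</sku></order></orders>",
}


def element_to_map(element):
    """Convert an element to a dict of child tag -> text (flat children only)."""
    return {child.tag: child.text for child in element}


def split_records(record, source_field, target_field, delimiter):
    """Emit one record per delimiter element, copying the other incoming fields."""
    root = ET.fromstring(record[source_field])
    out = []
    # findall accepts a plain element name or a limited XPath subset such as "a/b".
    for element in root.findall(delimiter):
        new_record = dict(record)
        new_record[target_field] = element_to_map(element)
        out.append(new_record)
    return out


records = split_records(record, "payload", "parsed", "order")
```

Here "order" sits directly below the root, so a bare element name is enough. For deeper elements, a path such as "batch/order" would play the role of the simplified XPath expression.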

You can configure the processor to include the XPath to each parsed XML element and XML attribute in field attributes. This also places each namespace in an xmlns record header attribute. You can also configure the processor to include attributes and namespace declarations in the record as field attributes. By default, it includes XML attributes and namespace declarations as fields.

When configuring the XML parser, specify the field to parse and the output field to use. In the "Properties" panel on the "General" tab, configure the properties shown in the table below.

Property | Description
---------|------------
Name | Stage name.
Description | Optional description.
Required fields | Fields that must include data for the record to be passed into the stage. You can include the fields that the stage uses. Records that do not include all required fields are processed based on the error handling configured for the pipeline.
Preconditions | Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click the Add button to create additional preconditions. Records that do not meet all the preconditions are processed based on the error handling configured for the stage.
On record error | Error record handling for the stage. Discard: discards the record. Send to Error: sends the record to the pipeline for error handling.

Create a scalable DOM

Unlike DOM, SAX is event-based, so it does not build an in-memory tree representation of the input document. SAX processes the input document element by element and reports events and significant data to callback methods in the application.

There are three ways to create a DOM in the Java XDK:

  1. Parse a document using DOMParser. This has been the traditional XDK approach.
  2. Create a scalable DOM using the XMLDOMImplementation factory method.
  3. Use the XMLDocument constructor. This is not a common solution in XDK.

With SAX, the document is processed as a linear sequence of events.
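The event-based model can be illustrated with a short sketch. It uses Python's standard xml.sax module rather than the Oracle XDK discussed here, and the EventLog handler is a hypothetical class written for this example.

```python
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Record SAX callbacks in the order they arrive; no document tree is built."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # The parser may deliver text in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


handler = EventLog()
xml.sax.parseString(b"<text><para>hello world</para></text>", handler)
for event in handler.events:
    print(event)
```

The application sees start, text, and end events in document order and keeps only what it chooses to store, which is why memory use stays low.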

In general, the SAX API provides the following benefits:

  1. It is useful for search operations and other programs that do not need to manipulate an XML tree.
  2. It does not consume significant memory resources.
  3. It is faster than DOM when retrieving XML documents from a database.

The JAXP API lets you plug in a SAX or DOM parser implementation. The SAX and DOM APIs provided by Oracle XDK are examples of vendor-specific implementations supported by JAXP.

In general, the advantage of JAXP is that you can use it to write interoperable applications. If an application uses only features available through JAXP, it can switch implementations very easily.

The main drawback of JAXP is that it runs slower than vendor-specific APIs.

Message Creation Example

When creating XML documents, it is useful to write the opening and closing tags of an element at the same time and then fill in the content. One of the fatal errors in XML is forgetting to close a tag when creating an element.

First, declare the XML version. After the version declaration, define the root element of the document. Here, message is used as the root element:

<?xml version="1.0" encoding="iso-8859-1"?>
<message>
</message>



Relationships in XML are described in terms of parents and children. In this example, the root element message is the parent, and it has one child element, email. Indent the code to show that one element is a child of another:

<?xml version="1.0" encoding="iso-8859-1"?>
<message>
  <email></email>
</message>



Now that the XML declaration, the root element, and a child element are in place, decide what information the email should carry. Suppose you want to store the sender, the recipients, the subject, and the body text. Since sender and recipient information usually appears in a message header, these elements become children of a header element. In this case, the XML data looks like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<message>
  <email>
    <header>
      <sender>info@.edu</sender>
      <recipient>info@.edu</recipient>
    </header>
    <subject>Re: XML Lesson</subject>
    <text>My XML project.</text>
  </email>
</message>
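One way to avoid forgetting a closing tag is to build the document as a tree and let a library serialize it. The sketch below assumes Python's xml.etree.ElementTree; the e-mail addresses are placeholders invented for the example, not values from the text above.

```python
import xml.etree.ElementTree as ET

# Build the tree top-down; each SubElement call attaches a child to its parent.
message = ET.Element("message")
email = ET.SubElement(message, "email")
header = ET.SubElement(email, "header")
ET.SubElement(header, "sender").text = "sender@example.edu"        # placeholder
ET.SubElement(header, "recipient").text = "recipient@example.edu"  # placeholder
ET.SubElement(email, "subject").text = "Re: XML Lesson"
ET.SubElement(email, "text").text = "My XML project."

# Serializing from a tree closes every element that was opened.
xml_text = ET.tostring(message, encoding="unicode")
print(xml_text)
```

The parent-child relationships are expressed by the calls themselves: header is created under email, and sender and recipient under header.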



Writing a message document

The information you need to record about a letter includes the sender, the recipient, and the text of the letter. In addition, you need the date the letter was sent and the salutation that opens the message. In XML, it looks as follows:

<?xml version="1.0" encoding="iso-8859-1"?>
<message>
  <letter>
    <letterhead>
      <sender>MyName</sender>
      <recipient>YourName</recipient>
      <date>2013</date>
    </letterhead>
    <text>
      <salutation>Hello</salutation>
      How are you?
    </text>
  </letter>
</message>



Attributes are useful if you want to track whether a message is a reply. Instead of creating an additional element, assign an attribute to the email or letter element that indicates whether the document is a response to a previous message:

<email reply="yes">
<letter reply="no">
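Reading such an attribute back is straightforward. The following sketch assumes Python's xml.etree.ElementTree; the is_reply helper and the rule that a missing attribute means "not a reply" are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

email = ET.fromstring('<email reply="yes"><subject>Re: XML Lesson</subject></email>')
letter = ET.fromstring('<letter reply="no"><text>How are you?</text></letter>')


def is_reply(element):
    """Interpret the reply attribute; a missing attribute counts as not a reply."""
    return element.get("reply", "no") == "yes"


print(is_reply(email), is_reply(letter))
```

Because the flag lives in the start tag, the element content stays free for the message itself.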



When creating XML documents, it’s always useful to spend a little time deciding what information you want to keep, and also what relationships the elements will have.

Developer Apps

At its core, XML lets a software developer create a vocabulary and use it to describe data. For example, when computers exchange data, the bare number 42 carries no meaning. If the sender indicates that the value is a processor temperature expressed in degrees Celsius, it becomes meaningful. Only when the sender and receiver share an agreed understanding of what the information means can they use it for its intended purpose.
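The temperature example can be sketched as a tiny shared vocabulary. The reading element and its quantity and unit attributes are hypothetical names invented here, and the code is a minimal illustration in Python rather than a prescribed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: a <reading> element with "quantity" and "unit" attributes.
doc = '<reading quantity="cpu-temperature" unit="celsius">42</reading>'


def read_temperature_celsius(text):
    """Return the value only if the document uses the agreed vocabulary."""
    reading = ET.fromstring(text)
    if reading.tag != "reading" or reading.get("unit") != "celsius":
        raise ValueError("unknown vocabulary or unit")
    return float(reading.text)


value = read_temperature_celsius(doc)
print(value)
```

The receiver can interpret 42 only because both sides agreed on what reading and unit mean. A document using a different unit is rejected instead of being misread.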

Before XML, exchanging data between systems required a number of prior agreements about the data and its meaning. With XML, systems can exchange data without such prior agreement, provided that both understand the same vocabulary, that is, they "speak" the same language. Since XML was introduced, several kinds of applications have emerged.

Web publishing - XML allows you to create interactive pages, lets customers customize those pages, and makes e-commerce applications more intuitive to build. An example is presented below.

[Figure: Web search and web task automation]

Web search and automation of web tasks - XML identifies the type of information contained in a document, making it easier to obtain useful results when searching the Web.

General applications - XML provides a standard method for accessing information, making it easy for devices of all kinds to use, store, transfer, and display data.

E-business applications - XML makes electronic data interchange (EDI) more accessible for information exchange, business-to-business transactions, and business-to-consumer transactions. An example of an XML request for an event handler, opening a connection, and sending requests is as follows.

[Figure: Event Handler Request XML Example]

Metadata applications - XML makes it easier to express metadata in a portable, reusable format.

Pervasive computing - XML provides portable and structured information types for display on pervasive (wireless) computing devices such as personal digital assistants (PDAs), cell phones, and others.

Advantages and disadvantages of the language

Relational database systems cannot process data independently of its context, so they do not fully meet the requirements of e-business. Traditional databases are also poorly suited to audio, video, and other complex data.

Language Benefits:

  1. Open and extensible. The XML structure is adaptable and can be modified to fit an industry's vocabulary. Users can add elements as needed.
  2. Internationalization. XML supports multilingual documents and the Unicode standard, which is important for e-business applications.
  3. Future-oriented technology. XML is backed by the W3C and supported by major software vendors. It is also used in a growing number of industries.
  4. Self-describing. Business applications have tasks beyond the simple presentation of content, so XML is used because it keeps the data fully usable and correctly presented. In this respect, XML is preferable to traditional database systems.
  5. Integration of traditional databases and formats. XML documents can carry all types of data: classic (text, numbers), multimedia (sounds), and active formats (Java applets, ActiveX components).
  6. Changes in presentation. Style sheets can be used to change how documents or websites are presented without changing the actual data.
  7. One server. Data from different databases and multiple servers can be part of one XML document. In effect, the whole World Wide Web can be treated as a single database.

[Figure: Convert to a single database]

Thus, XML has been extremely successful in markup and in the exchange of data and metadata, ensuring interoperability, transparent transport, and storage. Given the current level of interest in next-generation enterprise systems, the use of XML will grow, as it is a core technology for web services, portal development, and service-oriented architectures.



