Tail latency is a term used to describe the high-percentile response times seen in a system. This is usually measured at the 95th, 99th, or 99.9th percentile, not the average latency. In distributed systems, cloud computing, and large-scale web services, even a small number of slow requests can make the user experience and system performance much worse. Tail latency often happens because of things like resource contention, network variability, garbage collection pauses, and hardware heterogeneity. A major problem in system design is managing tail latency, because lowering average latency doesn't always make the worst-case performance better. To lessen its effects, people often use techniques like request hedging, replication, load balancing, and adaptive timeouts. In latency-sensitive applications like search engines, financial systems, and real-time services, where service-level objectives (SLOs) are often based on high-percentile latencies, it is especially important to understand and improve tail latency.
Knowledge graph embedding
In representation learning, knowledge graph embedding (KGE), also called knowledge representation learning (KRL), or multi-relation learning, is a machine learning task of learning a low-dimensional representation of a knowledge graph's entities and relations while preserving their semantic meaning. Leveraging their embedded representation, knowledge graphs can be used for various applications such as link prediction, triple classification, entity recognition, clustering, and relation extraction. == Definition == A knowledge graph G = { E , R , F } {\displaystyle {\mathcal {G}}=\{E,R,F\}} is a collection of entities E {\displaystyle E} , relations R {\displaystyle R} , and facts F {\displaystyle F} . A fact is a triple ( h , r , t ) ∈ F {\displaystyle (h,r,t)\in F} that denotes a link r ∈ R {\displaystyle r\in R} between the head h ∈ E {\displaystyle h\in E} and the tail t ∈ E {\displaystyle t\in E} of the triple. Another notation that is often used in the literature to represent a triple (or fact) is ⟨ head , relation , tail ⟩ {\displaystyle \langle {\text{head}},{\text{relation}},{\text{tail}}\rangle } . This notation is called the Resource Description Framework (RDF). A knowledge graph represents the knowledge related to a specific domain; leveraging this structured representation, it is possible to infer a piece of new knowledge from it after some refinement steps. However, nowadays, people have to deal with the sparsity of data and the computational inefficiency to use them in a real-world application. The embedding of a knowledge graph is a function that translates each entity and each relation into a vector of a given dimension d {\displaystyle d} , called embedding dimension. It is even possible to embed the entities and relations with different dimensions. The embedding vectors can then be used for other tasks. A knowledge graph embedding is characterized by four aspects: Representation space: The low-dimensional space in which the entities and relations are represented. Scoring function: A measure of the goodness of a triple-embedded representation. Encoding models: The modality in which the embedded representation of the entities and relations interact with each other. Additional information: Any additional information coming from the knowledge graph that can enrich the embedded representation. Usually, an ad hoc scoring function is integrated into the general scoring function for each additional piece of information. == Embedding procedure == All algorithms for creating a knowledge graph embedding follow the same approach. First, the embedding vectors are initialized to random values. Then, they are iteratively optimized using a training set of triples. In each iteration, a batch of size b {\displaystyle b} triples is sampled from the training set, and a triple from it is sampled and corrupted—i.e., a triple that does not represent a true fact in the knowledge graph. The corruption of a triple involves substituting the head or the tail (or both) of the triple with another entity that makes the fact false. The original triple and the corrupted triple are added in the training batch, and then the embeddings are updated, optimizing a scoring function. Iteration stops when a stop condition is reached. Usually, the stop condition depends on the overfitting of the training set. At the end, the learned embeddings should have extracted semantic meaning from the training triples and should correctly predict unseen true facts in the knowledge graph. === Pseudocode === The following is the pseudocode for the general embedding procedure. algorithm Compute entity and relation embeddings input: The training set S = { ( h , r , t ) } {\displaystyle S=\{(h,r,t)\}} , entity set E {\displaystyle E} , relation set R {\displaystyle R} , embedding dimension k {\displaystyle k} output: Entity and relation embeddings initialization: the entities e {\displaystyle e} and relations r {\displaystyle r} embeddings (vectors) are randomly initialized while stop condition do S b a t c h ← s a m p l e ( S , b ) {\displaystyle S_{batch}\leftarrow sample(S,b)} // Sample a batch from the training set for each ( h , r , t ) {\displaystyle (h,r,t)} in S b a t c h {\displaystyle S_{batch}} do ( h ′ , r , t ′ ) ← s a m p l e ( S ′ ) {\displaystyle (h',r,t')\leftarrow sample(S')} // Sample a corrupted fact T b a t c h ← T b a t c h ∪ { ( ( h , r , t ) , ( h ′ , r , t ′ ) ) } {\displaystyle T_{batch}\leftarrow T_{batch}\cup \{((h,r,t),(h',r,t'))\}} end for Update embeddings by minimizing the loss function end while == Performance indicators == These indexes are often used to measure the embedding quality of a model. The simplicity of the indexes makes them very suitable for evaluating the performance of an embedding algorithm even on a large scale. Given Q {\displaystyle {\ce {Q}}} as the set of all ranked predictions of a model, it is possible to define three different performance indexes: Hits@K, MR, and MRR. === Hits@K === Hits@K or in short, H@K, is a performance index that measures the probability to find the correct prediction in the first top K model predictions. Usually, it is used k = 10 {\displaystyle k=10} . Hits@K reflects the accuracy of an embedding model to predict the relation between two given triples correctly. Hits@K = | { q ∈ Q : q < k } | | Q | ∈ [ 0 , 1 ] {\displaystyle ={\frac {|\{q\in Q:q Audience capture is the phenomenon where an influencer is affected by their audience, catering to it with what they believe it wants to hear or is willing to pay for. This creates a positive feedback loop, which can lead the influencer to express more extreme views and behaviors. A famous example of audience capture can be found in the story of the online influencer Nicholas Perry, known as Nikocado Avocado. Perry started off on YouTube with videos of himself playing the violin and supporting veganism. He then shifted to videos of himself eating known as mukbang. Audience capture led him to more and more extreme eating leading him in turn to obesity and poor health. The effect can cause ideological media creators to become more politically radical, based on the feedback of their audience. A content reference identifier or CRID is a concept from the standardization work done by the TV-Anytime forum. It is or closely matches the concept of the Uniform Resource Locator, or URL, as used on the World-Wide Web: A unit of content, in a broadcast stream, can be referred to by its globally unique CRID in the same way that a webpage can be referred to by its globally unique URL on the web. The concept of CRID permits referencing contents unambiguously, regardless of their location, i.e., without knowing specific broadcast information (time, date and channel) or how to obtain them through a network, for instance, by means of a streaming service or by downloading a file from an Internet server. The receiver must be capable of resolving these unambiguous references, i.e. of translating them into specific data that will allow it to obtain the location of that content in order to acquire it. This makes it possible for recording processes to take place without knowing that information, and even without knowing beforehand the duration of the content to be recorded: a complete series by a simple click, a program that has not been scheduled yet, a set of programs grouped by a specific criterion... This framework allows for the separation between the reference to a given content (the CRID) and the necessary information to acquire it, which is called a “locator”. Each CRID may lead to one or more locators which will represent different copies of the same content. They may be identical copies broadcast in different channels or dates, or cost different prices. They may also be distinct copies with different technical parameters such as format or quality. It may also be the case that the resolution process of a CRID provides another CRID as a result (for example, its reference in a different network, where it has an alternative identifier assigned by a different operator) or a set of CRIDs (for instance, if the original CRID represents a TV series, in which case the resolution process would result in the list of CRIDs representing each episode). From the above it can be concluded that provided that a given content can belong to many groups (each possibly defined by distinctive qualities), it is possible that many CRIDs carry the same content. That is, several CRIDs may be resolved into the same locator. A CRID is not exactly a universal, unique and exclusive identifier for a given content. It is closely related to the authority that creates it, to the resolution service provider, and to the content provider in such a way that the same content may have different CRIDs depending on the field in which they are used (for example, a different one for each television operator that has the rights to broadcast the content). == Format == A CRID is specified much like URLs. In fact, a CRID is a so-called URI. Typically, the content creator, the broadcaster or a third party will use their DNS-names in a combination with a product-specific name to create globally unique CRIDs. That is, the syntax of a CRID is: crid://authority/data The authority field represents the entity that created the CRID and its format is that of a DNS name. The data field represents a string of characters that will unambiguously identify the content within the authority scope (it is a string of characters assigned by the authority itself). As an example, let's assume that BBC wanted to make a CRID for (all the programs of) the Olympics in China. It may have looked something like this crid://bbc.co.uk/olympics/2008/ This would be a group CRID, that is, a CRID representing a group of contents. Then, to refer to a specific event – such as the women's shot-put final – they could have used the following inside their metadata. crid://bbc.co.uk/olympics/2008/final/shotput/women Currently, four types of CRIDs are playing a major role in some unidirectional television networks: programme CRID, series CRID, group CRID, and recommendation CRID. One of the most important applications of CRIDs is the so-called series link recording function (SL) of modern digital video recorders (DVR, PVR). In turn, a locator is a string of characters that contains all the necessary information for a receiver to find and acquire a given content, whether it is received through a transport stream, located in local storage, downloaded as a file from an Internet server, or through a streaming service. For example, a DVB locator will include all the necessary parameters to identify a specific content within a transport stream: network, transport stream, service, table and/or event identifiers. The locators' format, as established in TV-Anytime, is quite generic and simple, and corresponds to: [transport-mechanism]:[specific-data] The first part of the locator's format (the transport mechanism) must be a string of characters that is unique for each mechanism (transport stream, local file, HTTP Internet access...). The second part must be unambiguous only within the scope of a given transport mechanism and will be standardized by the organism in charge of the regulation of the mechanism itself. For instance, a DVB locator to identify a content within the transport stream of networks that follow this standard would be: dvb://112.4a2.5ec;2d22~20121212T220000Z—PT01H30M which would indicate a content (identified by the string “2d22”) that airs on a channel available on a DVB network identified by the address “112.4a2.5ec” (network “112”, transport stream “4a2” and service “5ec”), on 12 December 2012 at 10 p.m. and with a duration of 90 minutes. == The location resolution process == The location resolution process is the procedure by which, starting from the CRID of a given content, one or several locators of that content are obtained. Resolving a CRID can be a direct process, which leads immediately to one or many locators, or it may also happen that in the first place one or many intermediate CRIDs are returned, which must undergo the same procedure to finally obtain one or several locators. This procedure involves some information elements, among which we find two structures named resolving authority record (RAR) and ContentReferencingTable, respectively. Consulting them repeatedly will take the receiver from a CRID to one or many locators that will allow it to acquire the content. The RAR table The RAR table is one or many data structures that provide the receiver, for each authority that submits CRIDs, information on the corresponding resolution service provider. Among other things, it informs about which mechanism is used to provide information to resolve the CRIDs from each authority. That is, one or many RAR records must exist for each authority that indicate the receiver where it has to go to resolve the CRIDs of that particular authority. For example, in the record of the figure (expressed by means of a XML structure, according to the XML Schema defined in the TV-Anytime) there is an authority called “tve.es”, whose resolution service provider is the entity “rtve.es”, available on the URL "http://tva.rtve.es/locres/tve", which means there is resolution information in that URL. These RAR records will have reached the receiver in an indefinite form, unimportant for the TV-Anytime specification, which will depend on the specific transport mechanism of the network to which the receiver is connected. Each family of standards that regulates distribution networks (DVB, ATSC, ISDB, IPTV...) will have previously defined such procedure, which will be used by devices certified according to those standards. The ContentReferencingTable table The second structure involved in the location resolution process is a proper resolution table which, given a content's CRID, returns one or several locators that enable the receiver to access an instance of that content, or one or many CRIDs that allow it to move forward in the resolution process. The figure shows an example of this second structure, an XML document according to the specifications of the XML Schema defined in TV-Anytime. In it, several sections are included ( In computer architecture, zero-overhead looping is a hardware feature found in some processors that enables loops to execute without the performance cost of traditional loop control instructions. Instead of software managing loop iterations, the processor's hardware handles repetition automatically, saving clock cycles and improving efficiency. This technique is commonly employed in digital signal processors (DSPs) and certain complex instruction set computer (CISC) architectures. == Background == In many instruction sets, a loop must be implemented by using instructions to increment or decrement a counter, check whether the end of the loop has been reached, and if not jump to the beginning of the loop so it can be repeated. Although this typically only represents around 3–16 bytes of space for each loop, even that small amount could be significant depending on the size of the CPU caches. More significant is that those instructions each take time to execute, time which is not spent doing useful work. The overhead of such a loop is apparent compared to a completely unrolled loop, in which the body of the loop is duplicated exactly as many times as it will execute. In that case, no space or execution time is wasted on instructions to repeat the body of the loop. However, the duplication caused by loop unrolling can significantly increase code size, and the larger size can even impact execution time due to cache misses. (For this reason, it's common to only partially unroll loops, such as transforming it into a loop which performs the work of four iterations in one step before repeating. This balances the advantages of unrolling with the overhead of repeating the loop.) Moreover, completely unrolling a loop is only possible for a limited number of loops: those whose number of iterations is known at compile time. For example, the following C code could be compiled and optimized into the following x86 assembly code: == Implementation == Processors with zero-overhead looping have machine instructions and registers to automatically repeat one or more instructions. Depending on the instructions available, these may only be suitable for count-controlled loops ("for loops") in which the number of iterations can be calculated in advance, or only for condition-controlled loops ("while loops") such as operations on null-terminated strings. === Examples === ==== PIC ==== In the PIC instruction set, the REPEAT and DO instructions implement zero-overhead loops. REPEAT only repeats a single instruction, while DO repeats a specified number of following instructions. ==== Blackfin ==== Blackfin offers two zero-overhead loops. The loops can be nested; if both hardware loops are configured with the same "loop end" address, loop 1 will behave as the inner loop and repeat, and loop 0 will behave as the outer loop and repeat only if loop 1 would not repeat. Loops are controlled using the LTx and LBx registers (x either 0 to 1) to set the top and bottom of the loop — that is, the first and last instructions to be executed, which can be the same for a loop with only one instruction — and LCx for the loop count. The loop repeats if LCx is nonzero at the end of the loop, in which case LCx is decremented. The loop registers can be set manually, but this would typically consume 6 bytes to load the registers, and 8–16 bytes to set up the values to be loaded. More common is to use the loop setup instruction (represented in assembly as either LOOP with pseudo-instruction LOOP_BEGIN and LOOP_END, or in a single line as LSETUP), which optionally initializes LCx and sets LTx and LBx to the desired values. This only requires 4–6 bytes, but can only set LTx and LBx within a limited range relative to where the loop setup instruction is located. ==== x86 ==== The x86 assembly language REP prefixes implement zero-overhead loops for a few instructions (namely MOVS/STOS/CMPS/LODS/SCAS). Depending on the prefix and the instruction, the instruction will be repeated a number of times with (E)CX holding the repeat count, or until a match (or non-match) is found with AL/AX/EAX or with DS:[(E)SI]. This can be used to implement some types of searches and operations on null-terminated strings. A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time said table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records. An index is a copy of selected columns of data, from a table, that is designed to enable very efficient search. An index normally includes a "key" or direct link to the original row of data from which it was copied, to allow the complete row to be retrieved efficiently. Some databases extend the power of indexing by letting developers create indexes on column values that have been transformed by functions or expressions. For example, an index could be created on upper(last_name), which would only store the upper-case versions of the last_name field in the index. Another option sometimes supported is the use of partial index, where index entries are created only for those records that satisfy some conditional expression. A further aspect of flexibility is to permit indexing on user-defined functions, as well as expressions formed from an assortment of built-in functions. == Usage == === Support for fast lookup === Most database software includes indexing technology that enables sub-linear time lookup to improve performance, as linear search is inefficient for large databases. Suppose a database contains N data items and one must be retrieved based on the value of one of the fields. A simple implementation retrieves and examines each item according to the test. If there is only one matching item, this can stop when it finds that single item, but if there are multiple matches, it must test everything. This means that the number of operations in the average case is O(N) or linear time. Since databases may contain many objects, and since lookup is a common operation, it is often desirable to improve performance. An index is any data structure that improves the performance of lookup. There are many different data structures used for this purpose. There are complex design trade-offs involving lookup performance, index size, and index-update performance. Many index designs exhibit logarithmic (O(log(N))) lookup performance and in some applications it is possible to achieve flat (O(1)) performance. === Policing the database constraints === Indexes are used to police database constraints, such as UNIQUE, EXCLUSION, PRIMARY KEY and FOREIGN KEY. An index may be declared as UNIQUE, which creates an implicit constraint on the underlying table. Database systems usually implicitly create an index on a set of columns declared PRIMARY KEY, and some are capable of using an already-existing index to police this constraint. Many database systems require that both referencing and referenced sets of columns in a FOREIGN KEY constraint are indexed, thus improving performance of inserts, updates and deletes to the tables participating in the constraint. Some database systems support an EXCLUSION constraint that ensures that, for a newly inserted or updated record, a certain predicate holds for no other record. This can be used to implement a UNIQUE constraint (with equality predicate) or more complex constraints, like ensuring that no overlapping time ranges or no intersecting geometry objects would be stored in the table. An index supporting fast searching for records satisfying the predicate is required to police such a constraint. == Index architecture and indexing methods == === Non-clustered === The data is present in arbitrary order, but the logical ordering is specified by the index. The data rows may be spread throughout the table regardless of the value of the indexed column or expression. The non-clustered index tree contains the index keys in sorted order, with the leaf level of the index containing the pointer to the record (page and the row number in the data page in page-organized engines; row offset in file-organized engines). In a non-clustered index, The physical order of the rows is not the same as the index order. The indexed columns are typically non-primary key columns used in JOIN, WHERE, and ORDER BY clauses. There can be more than one non-clustered index on a database table. === Clustered === Clustering alters the data block into a certain distinct order to match the index, resulting in the row data being stored in order. Therefore, only one clustered index can be created on a given database table. Clustered indexes can greatly increase overall speed of retrieval, but usually only where the data is accessed sequentially in the same or reverse order of the clustered index, or when a range of items is selected. Since the physical records are in this sort order on disk, the next row item in the sequence is immediately before or after the last one, and so fewer data block reads are required. The primary feature of a clustered index is therefore the ordering of the physical data rows in accordance with the index blocks that point to them. Some databases separate the data and index blocks into separate files, others put two completely different data blocks within the same physical file(s). === Cluster === When multiple databases and multiple tables are joined, it is called a cluster (not to be confused with clustered index described previously). The records for the tables sharing the value of a cluster key shall be stored together in the same or nearby data blocks. This may improve the joins of these tables on the cluster key, since the matching records are stored together and less I/O is required to locate them. The cluster configuration defines the data layout in the tables that are parts of the cluster. A cluster can be keyed with a B-tree index or a hash table. The data block where the table record is stored is defined by the value of the cluster key. == Column order == The order that the index definition defines the columns in is important. It is possible to retrieve a set of row identifiers using only the first indexed column. However, it is not possible or efficient (on most databases) to retrieve the set of row identifiers using only the second or greater indexed column. For example, in a phone book organized by city first, then by last name, and then by first name, in a particular city, one can easily extract the list of all phone numbers. However, it would be very tedious to find all the phone numbers for a particular last name. One would have to look within each city's section for the entries with that last name. Some databases can do this, others just won't use the index. In the phone book example with a composite index created on the columns (city, last_name, first_name), if we search by giving exact values for all the three fields, search time is minimal—but if we provide the values for city and first_name only, the search uses only the city field to retrieve all matched records. Then a sequential lookup checks the matching with first_name. So, to improve the performance, one must ensure that the index is created on the order of search columns. == Applications and limitations == Indexes are useful for many applications but come with some limitations. Consider the following SQL statement: SELECT first_name FROM people WHERE last_name = 'Smith';. To process this statement without an index the database software must look at the last_name column on every row in the table (this is known as a full table scan). With an index the database simply follows the index data structure (typically a B-tree) until the Smith entry has been found; this is much less computationally expensive than a full table scan. Consider this SQL statement: SELECT email_address FROM customers WHERE email_address LIKE '%@wikipedia.org';. This query would yield an email address for every customer whose email address ends with "@wikipedia.org", but even if the email_address column has been indexed the database must perform a full index scan. This is because the index is built with the assumption that words go from left to right. With a wildcard at the beginning of the search-term, the database software is unable to use the underlying index data structure (in other words, the WHERE-clause is not sargable). This problem can be solved through the addition of another index created on reverse(email_address) and a SQL query like this: SELECT email_address FROM customers WHERE reverse(email_address) LIKE reverse('%@wikipedia.org');. This puts the wild-card at the right-most part of the query (now gro.aidepikiw@%), which the index on reverse(email_address) can satisfy. When the wildcard characters are used on both sides of the search word as %wikipedia.org%, the index available on this field is not used. Rather only a sequential search is performed, which takes O ( N ) {\displaystyle Audience capture is the phenomenon where an influencer is affected by their audience, catering to it with what they believe it wants to hear or is willing to pay for. This creates a positive feedback loop, which can lead the influencer to express more extreme views and behaviors. A famous example of audience capture can be found in the story of the online influencer Nicholas Perry, known as Nikocado Avocado. Perry started off on YouTube with videos of himself playing the violin and supporting veganism. He then shifted to videos of himself eating known as mukbang. Audience capture led him to more and more extreme eating leading him in turn to obesity and poor health. The effect can cause ideological media creators to become more politically radical, based on the feedback of their audience.Audience capture
Content reference identifier
Zero-overhead looping
Database index
Audience capture