To understand the attention mechanism, let us first take a look at the concept of attention itself.
The concept of attention in deep learning (DL) emerged from work on natural language processing (NLP), particularly machine translation.
Alex Graves, a lead AI research scientist at DeepMind, the renowned AI research lab, indirectly defined attention in a 2020 DeepMind lecture: attention arose from a set of real-world AI applications that involve time-varying data.
In terms of machine learning concepts, such collections of data are known as sequences.
The earliest machine learning problem from which the attention mechanism derives its concepts is the Sequence to Sequence (Seq2Seq) learning model.
Let us now understand how the attention mechanism works.
Attention is one of the most researched concepts in the domain of deep learning for problems such as neural machine translation and image captioning.
There are certain supporting concepts that help explain the attention mechanism as a whole, such as Seq2Seq models, encoders, decoders, hidden states, context vectors, and so on (we'll explore all of these in detail in upcoming posts).
In simple terms, attention refers to focusing on a certain component of the input and taking greater notice of it.
The DL-based attention mechanism works the same way: it directs the model's focus toward the specific parts of the input that matter most when processing data relevant to the problem.
Let us consider a sentence in English: "I hope you are doing well". Our goal is to translate this sentence into Spanish.
So, while the input sequence is the English sentence, the output sequence is supposed to be "Espero que lo estés haciendo bien".
For each word in the output sequence, the attention mechanism maps the relevant words in the input sequence. So, "Espero" in the output sequence will be mapped to "I hope" in the input sequence.
Higher 'weights', signifying relevance, are assigned to the input sequence words that correspond to each word in the output sequence.
This improves the accuracy of output prediction, as the model is better able to produce relevant output.
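The weighting step described above can be sketched as a small (scaled) dot-product attention computation. This is a minimal NumPy illustration, not code from any particular library: the toy word "embeddings", the query vector, and the function names are all hypothetical, chosen so that the second input word clearly matches the query.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: weights are positive and sum to 1
    e = np.exp(x - x.max())
    return e / e.sum()

def dot_product_attention(query, keys, values):
    """Score each input position against the query, normalize the
    scores into attention weights, and return the weighted sum of
    the values (the context vector)."""
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)   # relevance of each input word
    weights = softmax(scores)            # attention weights, sum to 1
    context = weights @ values           # weighted average of inputs
    return weights, context

# Toy embeddings for three input words (hypothetical values, chosen
# so that word 2 resembles the query)
keys = values = np.eye(3, 4)
query = np.array([0.0, 1.0, 0.0, 0.0])  # matches word 2's embedding

weights, context = dot_product_attention(query, keys, values)
print(weights.round(3))  # word 2 receives the largest weight
```

In a real Seq2Seq model the query would be the decoder's current hidden state and the keys/values would be the encoder's hidden states for each input word, but the mechanics, score, normalize, then average, are the same.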