CS404 Artifical Intelligence (NUIM)

Artificial Intelligence & Language Processing CS404 at National University of Ireland, Maynooth

Artificial Intelligence & Language Processing CS404 at National University of Ireland, Maynooth

Kartei Details

Karten 43
Sprache English
Kategorie Informatik
Stufe Universität
Erstellt / Aktualisiert 20.05.2015 / 20.05.2015
Weblink
https://card2brain.ch/box/cs404_artifical_intelligence_nuim
Einbinden
<iframe src="https://card2brain.ch/box/cs404_artifical_intelligence_nuim/embed" width="780" height="150" scrolling="no" frameborder="0"></iframe>

Document Retrieval (= Abfrage)

Queries.. ?!

Identify the top quality documents in a database that relate to some query:

  • Query Dependent: Evaluate document quality in relation to some given query
  • Query Independent (Page Rank): Evaluate document quality before (& independent of) any query terms. All ranking is done before any query arrives 

Measuring Retrieval Accuracy

Precision: the number of relevant documents retrieved, divided by the total number of documents retrieved. (=sind möglichst viele die ich bekomme relevant?)

 \(Precision = \frac{\#RelevantRetrieved}{\#TotalDocuments}\)

Recall: the number of relevant documents retrieved divided by the total number of relevant documents in the long-term memory (=bekomme ich möglichst alle relevanten?)

\(Recall = \frac{\#RelevantRetrieved}{\#TotalRelevant}\)

Document Ranking & Retrieval (What is it for? Citations, Examples)

  • For a database of Hyperlinked documents
  • Citations confer quality
  • Citations from highly referenced documents are even better
  • "Citation Ranking" of academic documents:
    • Google Scholar
    • Citeseer
    • Academic research microsoft

H - Index

  • H-index metric for Academic productivity and quality
  • h-index = h if you have published h papers, each of which has at least h citations

In-link vs. Out-link

  • out-link = backlink
  • Hyperlinks on a page are outlinks pointing to other pages on the web
  • An inlink to one page is an outlink from another page 
  • Link from A to B is an out-link from A but an in-link from B

 

Random Walk

  • Consider sombody randomly following links across the web (occasionally restarting at a random page)
  • He will frequently visit pages with many in-coming links (Pages with no in-link only get visited if hey actually starts at that page)
  • The summation of many random walks gives a likelihood of visiting that page \(\rightarrow\) This is the Google PageRank score!
  • Rank is based on theoretical visitors - not the number of people who actually visit a webpage

PageRank (formula!)

The PageRank of a page is the chance that a random surfer will lond on that page

\(PR(p) = (1-d) + d \cdot \sum_{(q,p) \in E} \frac{PR(q)}{outdegree(q)}\)

where PR = PageRank, d = 0.85, outdegree(q) = number of out-links contained on page q

Iterative Calculation of PageRank

  • Iterative Calculation of the true PageRank value: Begin by assuming all pages PR(x) = 0.15
  • Random iterative application of PageRank rule gradually "converges" to the final solution
  • Average Page Rank alue is 1, a few highly ranked pages, and vast numbers of poorly linked pages near 0.15 

Document Indexing

  • Documents are pre-ranked and pre-indexed by words
  • Google contains a document-word matrix, listing all webpages that use each word
  • When a query is supplied, Google finds the highest ranked page(s) that contain that term

tf-idf  Term-Document Relevance

  • tf-idf  = term frequency - inverse document frequency
  • The document-term index contains a relevance weight
    • indicating the relevance of a term to that document (0<r<1)
    • r large if term in: Title, bold, appears twice
    • Query Score = PageRank-value * relevance (tf-idf)
  • This allows 2 queries to return 2 pages, with "opposite" ranking
    • without relevance factor, documents could only be returned in a pre-determined order
  • The tf-idf value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general.

What is a Cellular Automata?

  • Simple computational machines
  • "Bottom up" approach provided
  • Behaviour "emerges" at higher leves: when looking at large clusters of cells

Components of a CA

1. states: e.g. 0 or 1

2. neighbourhood: the two adjacent (angrenzend) cells N C N

3. rules: depicts the new state of a cell for each (possible) local "configuration" of the cell and its two neighbours

1D Cellular Automata

If there are two possible states (0 or 1) for each of 3 cells, there are 2^3 = 8 rules required. And 2^8 = 256 different CA. 

 

CA - Game of Life - Rules (0 - 4)

0. Grid filled with "organisms", alive or dead

1. live cell < two live neighbours -> dies (loneliness)

2. live cell > three live neighbours -> dies (overcrowding)

3. live cell = two or three live neighbours -> lives

4. dead cell = three live neighbours -> comes to live

CA - Game of Life - Structures

  • Some structures die off
  • stable
    • block
    • boat
  • oscillate
    • blinker
  • move continually 
    • gliders (move across the grid)
  • more complex behaviour
    • glider guns (make gliders every n updates)

 

2D cellular automata

  • The most popular CA format
  • 2D grid
  • usually a 3x3 neighbourhood grid (in/excluding the corner squares)
  • Binary states (more states allowed in more complex CA)

Rabbits & Foxes

  • Simulate competitive populations
    • Simplification of producer/consumer
  • If there's few foxes, the rabbits thrive (=gedeihen) and breed rapidly
  • But then, the fox population soars (=steigt), consuming all rabbits
    • Until the foxes start to starve
  • Cells: Fox, rabbit, neutral

4 Types of CA - Name all types

  1. homogeneous state
  2. simple stable or periodic structures
  3. chaotic pattern
  4. long-lived, complx structures

4 Types of CA - Explain Class 1

Class 1: Homogeneous state: 

Cells die back completely after few steps of evolution. No initial configuration can produces patterns of any length using these rules, it will always evolve to a homogeneous state. These are non-reversible, because once the fild is blank, no previous state information can be derived.

4 Typles of CA - Explain Class 2

Class 2: Simple stable or periodic structures.

Self-contained structures can exist that will not terminate in any number of time steps. Another way of putting this is that the information of a cell can only propagate a finite distance, and thus cannot affect distant structures - regardless of time. This acts almost as a "filter" on the initial configuration, choosing some features to propagate over infinite steps, and some to remove completely. For a 1dCA this may look like solid bars over time on the automaton, or train tracks, stair steps, etc. 
Non-reversible just like class 1, since the fixed structures give no information about the initial state which was lost during evolution.

4 Types of CA - Explain Class 3

Class 3: Chaotic pattern

This is where cellular automata start to get interesting. The amount of current knowledge needed grows as one tries to predict further into the future - hence it is a chaotic system. As with all well-defined chaotic systems, a given state is completely reversible; the previous state (and all the ones before) can be predicted by examining the current state. non-deterministic - to find the value of a cell after an infinite amount of time, an infinite number of initial conditions must be considered. Behavior is chaotic, but not random.

4 Types of CA - Explain Class 4

Class 4: Long-lived, complex structures

These automata propagate cell information out at variable speeds, depending on the structures themselves. These are the only non-trivial, non-reversible automata; since the current cell values could have arisen from more than one previous configuration, current cell values cannot be extrapolated backwards to know previous configurations.
Finding the value of a given cell a certain numer of steps in the future is an intractable problem; except for special cases, it cannot be found without actually doing all the drawn-out computation. Conway's Game of Life is a class 4 cellular automaton.

Other Properties of CA

  • Binary vs. continuous CA
    • Binary, integer or Real cell states
  • Synchronous vs. asynchronous CA
    • Are all cells updated instantaneously, or one at a time (in a random order)
  • Totalistic CA
    • State depends only on total (average) of previous states (i.e. no left-to-right directionality)
  • Reversible CA
    • It's possible to return to previous states, runs forward and backward (in time)
  • Regular vs. irregular CA
    • Regular 2D grid vs. irregular tessellation of 2D plane. 

Firing Squad Problem

  • N riflemen (Schützen) standing next to each other in a line
  • Each can speak to, or hear, his immediate neighbours
  • Riflemaster issues command to first rifleman
  • Can they agree a signal system so that all riflemen fire simultaneously? 

One solution to the Firing Squad Problem using 15 states and 3n units of time. 

Intelligent Agents (CA)

  • Complex problems solved by interaction amongst (relatively) simple components (agents)
    • society of mind
  • Each agent performs some specific task, relatively autonomously
  • Problem solved in "bottom up" manner

Q: How does an individual loaf of bread get from the baker to your table? 
A: Individual agents (drivers) deliver quantities to shops. The shop-keeper (agent) stores bread for individual sale. You (agent) then buy a single loaf to deliver to your table! 

Conclusion CA

  • Cellular Automata are simple computational devices
    • Cells, states, update rules
  • "Intelligence" is an emergent property
    • like "agent" systems
  • CA can be seen as simple inteligent agents

Lazy vs. Eager Learning

  • Lazy: wait for query before "generalizing"
    • k-Neirest Neighbour, Case based Reasoning
  • Eager: generalize before seeing query
    • Radial basis function networks, ID3, Backpropagation, NaiveBayes,...
  • Eager creates global approximation - Lazy create many local approximations
  • If they use same H(ypothesis), lazy can represent more complex functions
    • e.g. consider H = linear functions

 

KNN - Basics

  • KNN is basically a two-part algorithm: 
    • Retrieve the K most similar recorded cases
    • "Average" across these K cases
  • Simplest case is where k=1

1 - Nearest Neighbour

  • Given a query instance xq, find the nearest neighbour xn and estimate f(xq) as f(xn
  • Find the most similar recorded problem 
    • Assume they have the same solutions
    • i.e. generalise across one point

KNN - Problem


 

Predicting f(xq)

  • Take "vote" among it's k nearest neighbours
    • if discrete-valued target function
  • Take a mean (Mittelwert) of f(x) values for the k nearest neighbours
    • if real-valued target function

When to use nearest Neighbour? 

  • Instances map to points in Rn
  • Less than 20 attributes per instance
  • When you've lots of training data

Advantages of Nearest Neighbour

  • "Training" is very fast
  • Learn complex target functions
  • Don't lose information
  • Constantly changing/updated database

Disadvantages of Nearest Neighbour

  • Slow at query time
  • Easily fooled by irrelevant attributes

Aris Process

  1. Representation
  2. Retrieval
  3. Graph Mapping
  4. Transfer spec# statements
  5. validate original code + new spec#

Compare and contrast mutation and crossover. Describe in detail the effects of one specific operator of each type. Use a diagram to illustrate your answer. 

Crossover: The production of offspring is frequently based on taking the first N chroosomes from parent one, and the remainder from parent two. This produces a new individual with some genetic material from each parent. 

Two-Point Crossover: You have two crossover points, so 3 strings of bits

 

Compare and contrast mutation and crossover. Describe in detail the effects of one specific operator of each type. Use a diagram to illustrate your answer. 

Mutation: The rate of mutation is determined by a global constant/variable which determines the probability of a random alteration of one bit of genetic material of an individual (This helps avoid local minima)

Single-point mutation: flipping one bit

What is a GA (Genetic Algorithm)? What is it used for?

  • model of machine learning which derives it's behavior from a metaphor of some of the mechanisms of Evolution, in nature
  • used for a number of different application areas, like multidimensional optimisation and search problems
  • parallel, non-comprehensive search for the global maximum of the graph
  • GAs represent a generate-and-test beam-search through hypothesis space
  • search is not precise (no guarantee that the global maximum will be found)
  • however, the result should be a good approximation of the maximum value

Steps of GA (Genetic Algorithm)

  1. Create a population of individuals randomly
  2. Evaluate fitness of each individual in the population
  3. Select Individuals
    1. Reproduction (Rank, Tournament)
    2. Crossover
    3. Mutation
  4. Repeat until a suitable fit individual is found

GA: Chromosome

  • Individuals within our universe represented by chromosomes - a set of character strings analogous to DNA
  • These chromosomes are used to represent the different parameters being optimised
  • in practice, we represent chromosomes by having arrays of bits