Last Updated on 2023-11-27 by Clay
Introduction
Neo4j is a graph database. Unlike traditional relational databases, a graph database focuses on the "graph" itself: the relationships and connections between nodes (entities). Each node can represent an object (such as a person, thing, or place), and edges represent the relationships between nodes (such as "friends with", "owns", or "located at").
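To make the node/edge idea concrete, here is a minimal in-memory sketch in plain Python. It does not use Neo4j at all; the Node and Relationship classes and the neighbors helper are hypothetical names I made up for illustration, and they only mimic how a graph database thinks about data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labeled entity with free-form properties, e.g. a Person."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge from one node to another."""
    start: Node
    rel_type: str
    end: Node


# Two nodes and one relationship (illustrative data only).
clay = Node("Person", {"name": "Clay", "age": 28})
wendy = Node("Person", {"name": "Wendy", "age": 23})
relationships = [Relationship(clay, "KNOWS", wendy)]


def neighbors(node, rel_type, rels):
    """Return the nodes reachable from `node` over outgoing `rel_type` edges."""
    return [r.end for r in rels if r.start is node and r.rel_type == rel_type]


print([n.properties["name"] for n in neighbors(clay, "KNOWS", relationships)])
# prints ['Wendy']
```

The point is that the relationship is a first-class object you can traverse directly, rather than something reconstructed through joins.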
Neo4j is a standout in the realm of graph databases, offering both community and enterprise editions. It allows us to flexibly establish complex data relationships and provides efficient querying and storage capabilities.
It's worth mentioning that Neo4j's query language is Cypher, a declarative language specifically designed for graph data structures, making data queries concise and intuitive.
With the rise of Large Language Models (LLMs), not only has the RAG architecture become popular (see my previous article, [Paper Reading] Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection), but the Graph RAG architecture has also been proposed to improve the quality of LLM responses.
Installation
Below is the installation process for Linux (Debian/Ubuntu distributions).
For other operating systems, refer to the official website: https://neo4j.com/docs/operations-manual/current/installation/
For other distributions, refer to: https://neo4j.com/docs/operations-manual/current/installation/linux/
The official documentation states that if you want to use a specific version of Java, you need to prepare the related environment before starting the installation; otherwise, the package manager will install OpenJDK by default.
wget -O - https://debian.neo4j.com/neotechnology.gpg.key | sudo apt-key add -
echo 'deb https://debian.neo4j.com stable latest' | sudo tee -a /etc/apt/sources.list.d/neo4j.list
sudo apt-get update
Afterward, you can use apt list -a neo4j to check which versions are available.
sudo apt install neo4j=1:5.14.0
After installation, inspect the neo4j service definition and then start the service.
systemctl cat neo4j.service
systemctl start neo4j.service
Then, visit http://localhost:7474/browser/
Use :server connect to connect to the database. The default username and password are both neo4j. Remember to change the password afterward.
Neo4j Basic Commands
CREATE
Create Node
CREATE
(p1:Person {name: "Clay", age: 28}),
(p2:Person {name: "Wendy", age: 23})
Create Relationship
MATCH (p1:Person {name: "Clay"}), (p2:Person {name: "Wendy"})
CREATE (p1)-[:KNOWS]->(p2)
MATCH
After creation, you can use MATCH (n) RETURN n to view the nodes created so far and the relationships between them.
MERGE
MERGE is similar to CREATE, but it first tries to match the given pattern and only creates it when no match exists, so it does not create duplicate nodes.
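As a rough sketch of how MERGE could be used from Python, the helper below builds a MERGE query plus its parameters. The function name build_merge_person is hypothetical (it is not part of the neo4j driver), and matching on name alone is my assumption about what counts as "the same person".

```python
def build_merge_person(name, age):
    """Return (query, parameters) for creating a Person only if no Person
    with this name exists yet. Hypothetical helper for illustration."""
    query = (
        "MERGE (p:Person {name: $name}) "  # match on name, create if missing
        "ON CREATE SET p.age = $age "      # only set age on a fresh node
        "RETURN p"
    )
    return query, {"name": name, "age": age}


query, params = build_merge_person("Clay", 28)
print(query)
print(params)
```

With the Python driver, the pair could then be passed along as session.run(query, params). Running it twice should leave a single Clay node, whereas two CREATE calls would produce two.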
DELETE
To delete every node together with its relationships, use DETACH DELETE (a plain DELETE fails on a node that still has relationships):
MATCH (n) DETACH DELETE n
WHERE
WHERE is a conditional statement, familiar to anyone who has learned SQL syntax. Here, I'm limiting the query to people older than 25 years.
MATCH (person: Person)
WHERE person.age > 25
RETURN person
SET
The SET command is used to update property values or add new properties to nodes and relationships.
MATCH (person:Person {name: "Wendy"})
SET person.age = 26
After updating, let's take another look at people over 25 years old.
MATCH (person: Person)
WHERE person.age > 25
RETURN person
REMOVE
Use REMOVE to remove properties or labels from nodes and relationships. To delete a relationship itself, use DELETE instead, as in the following query:
MATCH (p1:Person {name: "Clay"})-[knows:KNOWS]->(p2:Person {name: "Wendy"})
DELETE knows
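Note that the query above uses DELETE, which removes the relationship itself. For REMOVE proper, here is a small sketch that builds a query removing one property from a Person. The helper build_remove_property is a hypothetical name of mine; since the property name is spliced into the query text rather than passed as a parameter, the sketch validates it first.

```python
import re

# Hypothetical helper, not part of the neo4j driver.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_remove_property(prop):
    """Return a Cypher query that removes `prop` from the Person named $name."""
    # Reject anything that is not a plain identifier, because the name
    # becomes part of the query text itself.
    if not _IDENTIFIER.fullmatch(prop):
        raise ValueError(f"unsafe property name: {prop!r}")
    return f"MATCH (p:Person {{name: $name}}) REMOVE p.{prop} RETURN p"


print(build_remove_property("age"))
# prints: MATCH (p:Person {name: $name}) REMOVE p.age RETURN p
```

The resulting string could be run with a name parameter, for example {"name": "Wendy"}, after which the node should still exist but no longer carry an age.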
ORDER BY
Before sorting, first create many nodes with random values.
FOREACH (x IN RANGE(1, 30) |
CREATE (:RandomNode {value: rand()*100}))
First, let's see the ascending order:
MATCH (n:RandomNode)
RETURN n.value
ORDER BY n.value ASC
For descending order, simply change ASC to DESC.
MATCH (n:RandomNode)
RETURN n.value
ORDER BY n.value DESC
Using Python
First, install the Neo4j Python driver.
pip3 install neo4j
Then, you can use the following code for the first access.
from neo4j import GraphDatabase


class HelloWorldExample:

    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def print_greeting(self, message):
        with self.driver.session() as session:
            greeting = session.execute_write(self._create_and_return_greeting, message)
            print(greeting)

    @staticmethod
    def _create_and_return_greeting(tx, message):
        result = tx.run("CREATE (a:Greeting) "
                        "SET a.message = $message "
                        "RETURN a.message + ', from node ' + id(a)", message=message)
        return result.single()[0]


if __name__ == "__main__":
    greeter = HelloWorldExample("bolt://localhost:7687", "neo4j", "")
    greeter.print_greeting("hello, world")
    greeter.close()
This is based on the example from the official website. You will need to fill in the password for the neo4j user.
Below is a version I rewrote to run an arbitrary command:
# coding: utf-8
from typing import Any

from neo4j import GraphDatabase


class HelloWorldExample:

    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def run_command(self, command: str):
        with self.driver.session() as session:
            result = session.execute_write(self._run_command, command=command)
            for node in result:
                print(node)

    @staticmethod
    def _run_command(tx, command: str) -> Any:
        result = tx.run(command)
        # Assumes the query returns a column named "n", e.g. MATCH (n) RETURN n.
        return [record["n"] for record in result]


if __name__ == "__main__":
    # Fill in the password for the neo4j user.
    greeter = HelloWorldExample("bolt://localhost:7687", "neo4j", "")
    greeter.run_command("MATCH (n) RETURN n")
    greeter.close()
Output:
[Node element_ids and properties...]
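One limitation of the version above is that it only works for queries returning a column named n, and it takes no query parameters. A possible variation, sketched under the assumption that driver is an already-created GraphDatabase.driver(...) object, passes parameters to session.run and converts each record with record.data(). The function name run_query is my own.

```python
def run_query(driver, query, parameters=None):
    """Run one Cypher query and return each record as a plain dict.

    `driver` is assumed to be a neo4j driver created elsewhere with
    GraphDatabase.driver(uri, auth=(user, password)).
    """
    with driver.session() as session:
        result = session.run(query, parameters or {})
        # Consume the result while the session is still open.
        return [record.data() for record in result]
```

It could be called as run_query(driver, "MATCH (p:Person) WHERE p.age > $min_age RETURN p.name AS name", {"min_age": 25}). Passing values such as min_age as parameters, instead of formatting them into the string, avoids quoting problems and query injection.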