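For an introduction to Neo4j, see the article 《知识图谱技术与应用指南(转)》 (roughly "A Guide to Knowledge Graph Technology and Applications", a repost).
Python can talk to Neo4j through either neo4j or py2neo. The former is Neo4j's official driver, but py2neo has been around longer and has already reached v4.
Official documentation: https://py2neo.org/v4/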
0. Installation
Download Neo4j: https://neo4j.com/download/
Install with pip: pip install py2neo
Install from the GitHub source: pip install git+https://github.com/technige/py2neo.git#egg=py2neo
1. Data types
1.1 Node and Relationship objects
A graph database is built on nodes, edges, and properties, so the most important classes in py2neo are Node and Relationship:
from py2neo.data import Node, Relationship
a = Node("Person", name="Alice")
b = Node("Person", name="Bob")
ab = Relationship(a, "KNOWS", b)
print(ab)
# (Alice)-[:KNOWS]->(Bob)
If no relationship type is given, it defaults to TO. You can also define your own subclass of Relationship, as follows:
c = Node("Person", name="Carol")
class WorksWith(Relationship): pass
ac = WorksWith(a, c)
print(type(ac))
# <class '__main__.WorksWith'> when run as a script; the relationship type itself is rendered as WORKS_WITH
1.2 Subgraph objects
Set operations are the easiest way to build a subgraph:
s = ab | ac
print(s)
# {(alice:Person {name:"Alice"}), (bob:Person {name:"Bob"}), (carol:Person {name:"Carol"}), (Alice)-[:KNOWS]->(Bob), (Alice)-[:WORKS_WITH]->(Carol)}
print(s.nodes)  # nodes is a property in py2neo v4, so no call parentheses
# the set of nodes: (alice:Person {name:"Alice"}), (bob:Person {name:"Bob"}), (carol:Person {name:"Carol"})
print(s.relationships)  # likewise a property
# the set of relationships: (Alice)-[:KNOWS]->(Bob), (Alice)-[:WORKS_WITH]->(Carol)
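To make the set semantics concrete, here is a toy model in plain Python. It is only a sketch of the idea and not py2neo's implementation: Rel, subgraph and union are hypothetical helpers, and nodes are reduced to bare names. It shows that a union pulls in the nodes of every relationship and that a shared node such as Alice is stored once.

```python
# Toy stand-in for the idea of a subgraph: NOT py2neo's implementation.
from collections import namedtuple

Rel = namedtuple("Rel", ["start", "type", "end"])  # hypothetical helper


def subgraph(*rels):
    """Collect relationships and the nodes they touch into two frozensets."""
    relationships = frozenset(rels)
    nodes = frozenset(n for r in relationships for n in (r.start, r.end))
    return nodes, relationships


def union(s1, s2):
    """Union of two subgraphs: union the node sets and the relationship sets."""
    return s1[0] | s2[0], s1[1] | s2[1]


ab = Rel("Alice", "KNOWS", "Bob")
ac = Rel("Alice", "WORKS_WITH", "Carol")

nodes, relationships = union(subgraph(ab), subgraph(ac))
print(sorted(nodes))       # Alice appears once although both relationships use her
print(len(relationships))
```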
1.3 Path objects and Walkable types
A walkable is a subgraph with added traversal information.
w = ab + Relationship(b, "LIKES", c) + ac
print(w)
# (Alice)-[:KNOWS]->(Bob)-[:LIKES]->(Carol)<-[:WORKS_WITH]-(Alice)
1.4 Record objects
A Record is an ordered, keyed collection of values, much like a named tuple.
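For readers who have not used named tuples, the analogy can be sketched with the standard library alone. PersonRecord below is a hypothetical type, not a py2neo class. One difference worth keeping in mind: a Record is read by key with record["name"], as in the Cursor examples later in this article, whereas a named tuple uses attribute access.

```python
from collections import namedtuple

# Hypothetical record type with the same two fields as the query results in this article.
PersonRecord = namedtuple("PersonRecord", ["name", "born"])

record = PersonRecord(name="Keanu Reeves", born=1964)

print(record[0])         # access by position
print(record.name)       # access by key
print(record._asdict())  # convert to a mapping
print(list(record))      # values keep their order
```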
1.5 Table objects
A Table is a list of Record objects.
2 Graph database
The Graph class is the most important one for interacting with Neo4j.
from py2neo import Graph
graph = Graph(password="password")
print(graph.run("UNWIND range(1, 3) AS n RETURN n, n * n as n_sq").to_table())
# n | n_sq
# -----|------
# 1 | 1
# 2 | 4
# 3 | 9
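If the Cypher in that query is unfamiliar, the same rows can be produced in plain Python. Note that Cypher's range(1, 3) includes both ends, unlike Python's range. The render function is a hypothetical helper that only approximates the table layout; it is not what to_table() does internally.

```python
# Plain-Python equivalent of: UNWIND range(1, 3) AS n RETURN n, n * n AS n_sq
# Cypher's range(1, 3) includes both ends, so the Python counterpart is range(1, 4).
rows = [{"n": n, "n_sq": n * n} for n in range(1, 4)]


def render(rows):
    """Hypothetical helper: lay the rows out as a simple text table."""
    keys = list(rows[0])
    lines = [" | ".join(keys)]
    lines.append("-|-".join("-" * len(k) for k in keys))
    for row in rows:
        lines.append(" | ".join(str(row[k]).rjust(len(k)) for k in keys))
    return "\n".join(lines)


print(render(rows))
```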
2.1 The Database class
Used to connect to a graph database:
from py2neo import Database
db = Database("bolt://camelot.example.com:7687")
The default URI is bolt://localhost:7687:
>>> default_db = Database()
>>> default_db
<Database uri='bolt://localhost:7687'>
2.2 Graph
The Graph class represents the graph data storage space in a Neo4j database.
>>> from py2neo import Graph
>>> graph_1 = Graph()
>>> graph_2 = Graph(host="localhost")
>>> graph_3 = Graph("bolt://localhost:7687")
match: find relationships. Here alice is assumed to be a Node that already exists in the graph:
for rel in graph.match((alice, ), r_type="FRIEND"):
    print(rel.end_node["name"])
merge: merge nodes and relationships into the graph, matching existing nodes on a label and a property key:
>>> from py2neo import Graph, Node, Relationship
>>> g = Graph()
>>> a = Node("Person", name="Alice", age=33)
>>> b = Node("Person", name="Bob", age=44)
>>> KNOWS = Relationship.type("KNOWS")
>>> g.merge(KNOWS(a, b), "Person", "name")
Then add a third node:
>>> c = Node("Company", name="ACME")
>>> c.__primarylabel__ = "Company"
>>> c.__primarykey__ = "name"
>>> WORKS_FOR = Relationship.type("WORKS_FOR")
>>> g.merge(WORKS_FOR(a, c) | WORKS_FOR(b, c))
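The idea behind merging on a primary label and key is "create if absent, otherwise update". The following toy model in plain Python illustrates that idea under my reading of the v4 documentation. It is not how py2neo or Neo4j is implemented; store and merge_node are hypothetical names, and relationships are left out entirely.

```python
# Toy in-memory model of merge-by-key; it is NOT how py2neo or Neo4j is implemented.
store = {}  # maps (label, key value) -> property dict


def merge_node(label, key, **properties):
    """Create the node if (label, properties[key]) is new, otherwise update it."""
    identity = (label, properties[key])
    created = identity not in store
    store.setdefault(identity, {}).update(properties)
    return created


print(merge_node("Person", "name", name="Alice", age=33))  # first time: created
print(merge_node("Person", "name", name="Alice", age=34))  # same key: updated in place
print(merge_node("Company", "name", name="ACME"))
print(len(store))
print(store[("Person", "Alice")]["age"])
```

Running a merge twice with the same key therefore leaves a single node, which is the practical difference from create.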
nodes: look up nodes by ID, or match them by label and property:
>>> graph = Graph()
>>> graph.nodes[1234]
(_1234:Person {name: 'Alice'})
>>> graph.nodes.get(1234)
(_1234:Person {name: 'Alice'})
>>> graph.nodes.match("Person", name="Alice").first()
(_1234:Person {name: 'Alice'})
2.3 Transactions
commit(): commit the transaction.
create(subgraph): create nodes and relationships.
>>> from py2neo import Graph, Node, Relationship
>>> g = Graph()
>>> tx = g.begin()
>>> a = Node("Person", name="Alice")
>>> tx.create(a)
>>> b = Node("Person", name="Bob")
>>> ab = Relationship(a, "KNOWS", b)
>>> tx.create(ab)
>>> tx.commit()
>>> g.exists(ab)
True
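2.4 Query results
The Cursor class
Step forward one record at a time, printing the name from each:
while cursor.forward():
    print(cursor.current["name"])
Because a Cursor is iterable, you can also write:
for record in cursor:
    print(record["name"])
If only one record is of interest:
if cursor.forward():
    print(cursor.current["name"])
Or:
print(next(cursor)["name"])
To get a single value from a single record:
print(cursor.evaluate())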
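All of these calls consume the same stream of records, which is easy to miss. The sketch below uses ListCursor, a hypothetical stand-in written for this article that only mimics the navigation methods shown above over a plain list; it is not py2neo's Cursor and needs no database.

```python
class ListCursor:
    """Hypothetical stand-in that mimics the navigation methods described above."""

    def __init__(self, records):
        self._records = iter(records)
        self.current = None

    def forward(self):
        """Advance one record; return 1 on success and 0 when exhausted."""
        try:
            self.current = next(self._records)
        except StopIteration:
            return 0
        return 1

    def __iter__(self):
        return self

    def __next__(self):
        if not self.forward():
            raise StopIteration
        return self.current

    def evaluate(self, field=0):
        """Return one field of the next record, or None if there is none."""
        if not self.forward():
            return None
        return list(self.current.values())[field]


people = [{"name": "Alice"}, {"name": "Bob"}, {"name": "Carol"}]

cursor = ListCursor(people)
if cursor.forward():                          # look at the first record only
    print(cursor.current["name"])
print(cursor.evaluate())                      # consumes the second record
print([record["name"] for record in cursor])  # whatever is left
```

Each record is seen once, so mixing forward(), evaluate() and a for loop on one cursor splits the results between them.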
data(): extract all the data as a list of dictionaries:
>>> from py2neo import Graph
>>> graph = Graph()
>>> graph.run("MATCH (a:Person) RETURN a.name, a.born LIMIT 4").data()
[{'a.born': 1964, 'a.name': 'Keanu Reeves'},
{'a.born': 1967, 'a.name': 'Carrie-Anne Moss'},
{'a.born': 1961, 'a.name': 'Laurence Fishburne'},
{'a.born': 1960, 'a.name': 'Hugo Weaving'}]
evaluate(field=0): return the first field of the next record:
>>> from py2neo import Graph
>>> g = Graph()
>>> g.run("MATCH (a) WHERE a.email=$x RETURN a.name", x="bob@acme.com").evaluate()
'Bob Robertson'
stats(): return the query statistics:
>>> from py2neo import Graph
>>> g = Graph()
>>> g.run("CREATE (a:Person) SET a.name = 'Alice'").stats()
constraints_added: 0
constraints_removed: 0
contained_updates: True
indexes_added: 0
indexes_removed: 0
labels_added: 1
labels_removed: 0
nodes_created: 1
nodes_deleted: 0
properties_set: 1
relationships_created: 0
relationships_deleted: 0
to_data_frame(index=None, columns=None, dtype=None): return the data as a pandas DataFrame:
>>> from py2neo import Graph
>>> graph = Graph()
>>> graph.run("MATCH (a:Person) RETURN a.name, a.born LIMIT 4").to_data_frame()
   a.born              a.name
0    1964        Keanu Reeves
1    1967    Carrie-Anne Moss
2    1961  Laurence Fishburne
3    1960        Hugo Weaving
3 py2neo.matching: entity matching
3.1 Node matching
Use NodeMatcher to match nodes:
>>> from py2neo import Graph, NodeMatcher
>>> graph = Graph()
>>> matcher = NodeMatcher(graph)
>>> matcher.match("Person", name="Keanu Reeves").first()
(_224:Person {born:1964,name:"Keanu Reeves"})
Match with a where clause, in which _ stands for the node being matched:
>>> list(matcher.match("Person").where("_.name =~ 'K.*'"))
[(_57:Person {born: 1957, name: 'Kelly McGillis'}),
(_80:Person {born: 1958, name: 'Kevin Bacon'}),
(_83:Person {born: 1962, name: 'Kelly Preston'}),
(_224:Person {born: 1964, name: 'Keanu Reeves'}),
(_226:Person {born: 1966, name: 'Kiefer Sutherland'}),
(_243:Person {born: 1957, name: 'Kevin Pollak'})]
Sort with order_by() and cap the number of results with limit():
>>> list(matcher.match("Person").where("_.name =~ 'K.*'").order_by("_.name").limit(3))
[(_224:Person {born: 1964, name: 'Keanu Reeves'}),
(_57:Person {born: 1957, name: 'Kelly McGillis'}),
(_83:Person {born: 1962, name: 'Kelly Preston'})]
To get only the count, use len():
>>> len(matcher.match("Person").where("_.name =~ 'K.*'"))
6
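The where, order_by, limit and len steps above can be mirrored in plain Python to check what the =~ pattern selects. This is only an illustration over a hard-coded list of the names that appear in this article, with no database involved. Cypher's =~ has to match the whole string, so the closest standard-library counterpart is re.fullmatch.

```python
import re

names = [
    "Keanu Reeves", "Carrie-Anne Moss", "Laurence Fishburne", "Hugo Weaving",
    "Kelly McGillis", "Kevin Bacon", "Kelly Preston", "Kiefer Sutherland", "Kevin Pollak",
]

# Cypher's =~ must match the whole string, which corresponds to re.fullmatch.
matched = [name for name in names if re.fullmatch(r"K.*", name)]

print(len(matched))         # mirrors the len(...) call above
print(sorted(matched)[:3])  # mirrors order_by("_.name").limit(3)
```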
3.2 Relationship matching with RelationshipMatcher
It is used in much the same way as node matching, with methods such as:
first()
order_by(*fields)
where(*conditions, **properties)
4 Object-Graph Mapping (OGM)
The OGM layer binds Python objects to the underlying graph data:
from py2neo.ogm import GraphObject, Property, RelatedTo, RelatedFrom

class Movie(GraphObject):
    __primarykey__ = "title"

    title = Property()
    tag_line = Property("tagline")  # stored under the key "tagline"
    released = Property()

    actors = RelatedFrom("Person", "ACTED_IN")
    directors = RelatedFrom("Person", "DIRECTED")
    producers = RelatedFrom("Person", "PRODUCED")

class Person(GraphObject):
    __primarykey__ = "name"

    name = Property()
    born = Property()

    acted_in = RelatedTo(Movie)
    directed = RelatedTo(Movie)
    produced = RelatedTo(Movie)
4.1 Graph objects
GraphObject is the base class for every mapped class.
4.2 Property()
>>> class Person(GraphObject):
... name = Property()
...
>>> alice = Person()
>>> alice.name = "Alice Smith"
>>> alice.name
"Alice Smith"
4.3 Label()
A label is a boolean attribute that defaults to False:
>>> class Food(GraphObject):
... hot = Label()
...
>>> pizza = Food()
>>> pizza.hot
False
>>> pizza.hot = True
>>> pizza.hot
True
4.4 Related objects
class Person(GraphObject):
    __primarykey__ = "name"

    name = Property()
    likes = RelatedTo("Person")

# person is assumed to be a Person instance already loaded from the graph
for friend in person.likes:
    print(friend.name)
4.5 Object matching
>>> Person.match(graph, "Keanu Reeves").first()
<Person name='Keanu Reeves'>
>>> list(Person.match(graph).where("_.name =~ 'K.*'"))
[<Person name='Keanu Reeves'>,
<Person name='Kevin Bacon'>,
<Person name='Kiefer Sutherland'>,
<Person name='Kevin Pollak'>,
<Person name='Kelly McGillis'>,
<Person name='Kelly Preston'>]
4.6 Object operations
>>> alice = Person()
>>> alice.name = "Alice Smith"
>>> graph.push(alice)
>>> alice.__node__
(_123:Person {name: 'Alice Smith'})