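Overview
Unlike the classic random walk, the Node2Vec walk is a biased random walk that can explore a node's neighborhood in a breadth-first (BFS) or depth-first (DFS) manner. For details, see the Node2Vec algorithm.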
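To make the notion of bias concrete, here is a minimal Python sketch of the search bias as it is usually defined for Node2Vec. The function name `node2vec_bias` and its arguments are illustrative assumptions, not part of this product's API, and the actual implementation may differ in detail.

```python
def node2vec_bias(prev, candidate, neighbors_of_prev, p=1.0, q=1.0):
    """Unnormalized Node2Vec bias for stepping to `candidate`.

    `prev` is the node visited just before the current one, and
    `neighbors_of_prev` is the set of nodes adjacent to `prev`.
    (Hypothetical helper for illustration only.)
    """
    if candidate == prev:
        # Distance 0 from prev: stepping back, scaled by the return parameter p.
        return 1.0 / p
    if candidate in neighbors_of_prev:
        # Distance 1 from prev: staying close to prev (BFS-like move).
        return 1.0
    # Distance 2 from prev: moving away, scaled by the in-out parameter q.
    return 1.0 / q
```

With a large `p` and a small `q`, stepping back is heavily penalized and moving away is favored, so the walk behaves more like DFS. With a large `q`, the walk stays near the previous node, which is closer to BFS.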
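Considerations
- A walk can traverse self-loop edges.
- Edge directions are ignored: the results of the Node2Vec walk do not depend on the direction of edges.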
Example Graph
To create the example graph:
// Run the following statements one by one in an empty graph
create().edge_property(@default, "score", float)
insert().into(@default).nodes([{_id:"A"},{_id:"B"},{_id:"C"},{_id:"D"},{_id:"E"},{_id:"F"},{_id:"G"},{_id:"H"},{_id:"I"},{_id:"J"},{_id:"K"}])
insert().into(@default).edges([{_from:"A", _to:"B", score:1}, {_from:"A", _to:"C", score:3}, {_from:"C", _to:"D", score:1.5}, {_from:"D", _to:"C", score:2.4}, {_from:"D", _to:"F", score:5}, {_from:"E", _to:"C", score:2.2}, {_from:"E", _to:"F", score:0.6}, {_from:"F", _to:"G", score:1.5}, {_from:"G", _to:"J", score:2}, {_from:"H", _to:"G", score:2.5}, {_from:"H", _to:"I", score:1}, {_from:"I", _to:"I", score:3.1}, {_from:"J", _to:"G", score:2.6}])
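As a rough illustration of how this graph looks once edge directions are ignored, the following Python sketch mirrors the nodes and edges above as an undirected weighted adjacency map. The helper `build_undirected` is hypothetical, and summing the `score` values of parallel edges (such as C→D and D→C) is an assumption made for this sketch, not documented behavior.

```python
NODES = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"]

# (from, to, score) triples copied from the insert statement above.
EDGES = [
    ("A", "B", 1.0), ("A", "C", 3.0), ("C", "D", 1.5), ("D", "C", 2.4),
    ("D", "F", 5.0), ("E", "C", 2.2), ("E", "F", 0.6), ("F", "G", 1.5),
    ("G", "J", 2.0), ("H", "G", 2.5), ("H", "I", 1.0), ("I", "I", 3.1),
    ("J", "G", 2.6),
]

def build_undirected(nodes, edges):
    """Return {node: {neighbor: weight}} with edge direction ignored."""
    adj = {n: {} for n in nodes}  # isolated nodes such as K keep an empty map
    for src, dst, score in edges:
        adj[src][dst] = adj[src].get(dst, 0.0) + score
        if src != dst:
            # Mirror the edge; a self-loop (I -> I) is stored only once.
            adj[dst][src] = adj[dst].get(src, 0.0) + score
    return adj

adj = build_undirected(NODES, EDGES)
for node in NODES:
    print(node, sorted(adj[node]))
```

In this view, node I lists itself as a neighbor because of its self-loop, and node K has no neighbors at all, so a walk that starts at K cannot move.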
Creating HDC Graph
Load the entire current graph to the HDC server hdc-server-1 as hdc_node2vec_walk:
CALL hdc.graph.create("hdc-server-1", "hdc_node2vec_walk", {
  nodes: {"*": ["*"]},
  edges: {"*": ["*"]},
  direction: "undirected",
  load_id: true,
  update: "static",
  query: "query",
  default: false
})

hdc.graph.create("hdc_node2vec_walk", {
  nodes: {"*": ["*"]},
  edges: {"*": ["*"]},
  direction: "undirected",
  load_id: true,
  update: "static",
  query: "query",
  default: false
}).to("hdc-server-1")
Parameters
Algorithm name: random_walk_node2vec
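Name | Type | Spec | Default | Optional | Description |
---|---|---|---|---|---|
ids | []_id | / | / | Yes | Specifies the start nodes of the walks by _id; if unset, walks start from every node |
uuids | []_uuid | / | / | Yes | Specifies the start nodes of the walks by _uuid; if unset, walks start from every node |
walk_length | Integer | ≥1 | 1 | Yes | Depth of each walk, i.e., the number of nodes visited |
walk_num | Integer | ≥1 | 1 | Yes | Number of walks to perform from each specified start node |
p | Float | >0 | 1 | Yes | Return parameter; the larger the value, the lower the probability of stepping back to the previous node |
q | Float | >0 | 1 | Yes | In-out parameter; a value greater than 1 biases the walk toward nodes at the same level (BFS-like), while a value less than 1 biases it toward nodes farther away (DFS-like) |
edge_schema_property | []"<@schema.?><property>" | / | / | Yes | Numeric edge properties to use as weights; the weight of an edge is the sum of the values of all specified properties; edges without the specified properties are ignored |
return_id_uuid | String | uuid, id, both | uuid | Yes | Represents nodes in the results by _uuid, _id, or both |
limit | Integer | ≥-1 | -1 | Yes | Limits the number of results returned; -1 returns all results |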
File Writeback
CALL algo.random_walk_node2vec.write("hdc_node2vec_walk", {
  params: {
    return_id_uuid: "id",
    walk_length: 6,
    walk_num: 2,
    p: 10000,
    q: 0.0001
  },
  return_params: {
    file: {
      filename: "walks"
    }
  }
})

algo(random_walk_node2vec).params({
  project: "hdc_node2vec_walk",
  return_id_uuid: "id",
  walk_length: 6,
  walk_num: 2,
  p: 10000,
  q: 0.0001
}).write({
  file: {
    filename: 'walks'
  }
})
Result:
_ids
J,G,F,D,C,E,
D,C,A,B,A,C,
F,G,E,C,A,B,
H,I,I,H,G,F,
B,A,C,D,F,G,
A,B,A,B,A,C,
E,C,E,C,A,B,
K,
C,E,F,G,J,G,
I,I,H,G,F,E,
G,H,I,I,H,G,
J,G,F,D,C,E,
D,C,A,B,A,C,
F,E,C,D,F,E,
H,G,H,G,J,G,
B,A,C,D,F,G,
A,C,D,F,E,C,
E,C,E,C,A,B,
K,
C,A,B,A,C,D,
I,H,G,J,G,H,
G,H,I,I,H,G,
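For downstream processing, a file like the one above could be read back into lists of node IDs. The sketch below assumes the layout shown in the listing (a `_ids` header line, then one comma-separated walk per line with a trailing comma). The function name `parse_walks` is hypothetical, and node IDs that themselves contain commas are not handled.

```python
def parse_walks(text):
    """Parse walk-file content into a list of walks (lists of node IDs)."""
    walks = []
    lines = text.splitlines()
    for line in lines[1:]:  # skip the "_ids" header line
        # Splitting on "," leaves an empty token after the trailing comma.
        ids = [tok.strip() for tok in line.split(",") if tok.strip()]
        if ids:
            walks.append(ids)
    return walks

sample = "_ids\nJ,G,F,D,C,E,\nK,\nH,I,I,H,G,F,\n"
for walk in parse_walks(sample):
    print(walk)
```

Each parsed walk has at most walk_length entries. A single-element walk such as `["K"]` corresponds to an isolated start node.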
Full Return
CALL algo.random_walk_node2vec("hdc_node2vec_walk", {
  params: {
    return_id_uuid: "id",
    ids: ['J'],
    walk_length: 6,
    walk_num: 3,
    p: 2000,
    q: 0.001
  },
  return_params: {}
}) YIELD walks
RETURN walks

exec{
  algo(random_walk_node2vec).params({
    return_id_uuid: "id",
    ids: ['J'],
    walk_length: 6,
    walk_num: 3,
    p: 2000,
    q: 0.001
  }) as walks
  return walks
} on hdc_node2vec_walk
Result:
_ids |
---|
["J","G","F","D","C","E"] |
["J","G","J","G","F","D"] |
["J","G","J","G","H","I"] |
Stream Return
CALL algo.random_walk_node2vec("hdc_node2vec_walk", {
  params: {
    return_id_uuid: "id",
    ids: ['A'],
    walk_length: 5,
    walk_num: 10,
    p: 1000,
    q: 1,
    edge_schema_property: 'score'
  },
  return_params: {
    stream: {}
  }
}) YIELD walks
RETURN walks

exec{
  algo(random_walk_node2vec).params({
    return_id_uuid: "id",
    ids: ['A'],
    walk_length: 5,
    walk_num: 10,
    p: 1000,
    q: 1,
    edge_schema_property: 'score'
  }).stream() as walks
  return walks
} on hdc_node2vec_walk
Result:
_ids |
---|
["A","C","A","D","C"] |
["A","C","A","C","A"] |
["A","C","A","D","A"] |
["A","C","A","C","A"] |
["A","C","A","D","E"] |
["A","C","A","D","E"] |
["A","C","A","B","A"] |
["A","C","A","D","A"] |
["A","C","A","C","D"] |
["A","C","A","C","A"] |