The Assimilation Project  based on Assimilation version 1.1.7.1474836767
GettingStarted.c
1 assimilation > /etc/ld.so.conf.d/assimilation.conf
282  ldconfig /usr/lib/*/assimilation
283 </pre>
284 or a slightly different set of commands depending on where we install our libraries.
285 
286 @subsection TestifyTests testify tests
287 There are a large number of tests performed on the Python code
288 including the CMA code with database.
289 The project runs these tests before updating the master source
290 control instance on <i>hg.linux-ha.org</i>.
291 These regression tests also significantly exercise the C code underlying the
292 Python code, and the interfaces between the two bodies of code.
293 These tests bind to port 1984, so some of them will fail if port 1984 is not available.
294 
295 To run these tests, execute these steps:
296 - <tt>cd <i><source-code-directory></i>/cma</tt>
297 - <tt>testify tests</tt>
298 
299 The final line should look something like this:
300 <pre>PASSED. 74 tests / 22 cases: 74 passed, 0 failed. (Total test time 172.75s)</pre>
301 
302 
303 @subsection ValgrindTest testcode/grind.sh test
304 This code is a pure-C test which exercises the nanoprobe code with
305 a simulated CMA. It is run under valgrind to look for memory leaks
306 outside our object system (those are noted automatically).
307 There is a variety of hard-coded IP addresses used in these tests.
308 This can be ignored for the time being.
309 However, the test binds to port 1984, so it will fail if port 1984 is not available.
310 
311 This test is now run automatically by Testify - so see the section above for how to run it.
312 
313 Various things in the glib2 library do not free all their memory at exit, so
314 it is possible (likely?) that you will see things not being freed that are harmless.
315 Nevertheless, please report them to the mailing list.
316 
317 Normal output looks something like this:
318 <pre>
319 ** Message: Our OS supports dual ipv4/v6 sockets. Hurray!
320 ** Message: Joining multicast address.
321 ** Message: multicast join succeeded.
322 ** Message: CMA received startup message from nanoprobe at address [::1]:1984/1984.
323 ** Message: PARSED JSON: {"source":"netconfig","discovertype":"netconfig","description":"IP Network Configuration","host":"servidor","data":{"virbr0":{"ipaddrs":{"192.168.122.1/24":{"brd":"192.168.122.255","scope":"global","name":"virbr0"}},"mtu":1500,"address":"fe:54:00:9e:57:e7","carrier":true,"operstate":"up"},"vnet0":{"ipaddrs":{"fe80::fc54:ff:fe9e:57e7/64":{"scope":"link"}},"mtu":1500,"address":"fe:54:00:9e:57:e7","carrier":true,"duplex":"full","operstate":"unknown","speed":10},"eth0":{"default_gw":true,"duplex":"full","carrier":true,"speed":1000,"address":"00:1b:fc:1b:a8:73","ipaddrs":{"10.10.10.5/24":{"brd":"10.10.10.255","scope":"global","name":"eth0"},"fe80::21b:fcff:fe1b:a873/64":{"scope":"link"}},"operstate":"up","mtu":1500},"lo":{"ipaddrs":{"127.0.0.1/8":{"scope":"host","name":"lo"},"::1/128":{"scope":"host"}},"mtu":16436,"address":"00:00:00:00:00:00","carrier":true,"operstate":"unknown"}}}
324 ** Message: 1 JSON strings parsed. 0 errors.
325 ** Message: Connected to CMA. Happiness :-D
326 ** Message: CMA Received switch discovery data (type 31) over the 'wire'.
327 ** (process:4565): WARNING **: Peer at address 10.10.10.4:1984 is dead (has timed out).
328 ** Message: CMA Received dead host notification (type 26) over the 'wire'.
329 ** Message: QUITTING NOW! (heartbeat count)
330 ** (process:4565): WARNING **: _fsprotocol_send1.855: Attempt to send FrameSet while link shutting down - FrameSet ignored.
331 ** (process:4565): WARNING **: _fsproto_fsa.222: Got a 5 input for [::1]:1984/0 while in state 0
332 ** (process:4565): WARNING **: _fsproto_fsa.225: Frameset given was: FrameSet(fstype=17, [[{SignFrame object at 0x0x6121da0}], [SeqnoFrame(type=4, (272464917,0,7))], [{Frame object at 0x0x6121f40}]])
333 ** Message: Count of 'other' pkts received: 2
334 ** Message: No objects left alive. Awesome!
335 </pre>
336 
337 The <i>CMA Received switch discovery data</i> message will not occur unless the OS
338 you're running it on has a NIC directly connected to an LLDP-equipped switch
339 (CDP is not yet fully supported).
340 
341 You may also get a variety of debug messages, depending on the version of glib2 you have
342 installed. This code has a lot of debug enabled, but later versions of glib2 suppress
343 debug messages unless the environment variable G_MESSAGES_DEBUG is set to <b>all</b>.
344 
345 @subsection PingerTest testcode/pinger
346 The pinger program exercises the reliable UDP retransmission code in the
347 project. It is hard-wired to use port 19840. Hope that works for you.
348 It sends a number of packets with 5% simulated packet reception loss
349 and 5% simulated packet transmission loss. This gives an overall packet loss rate of 9.75% (1 - 0.95 x 0.95).
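
The 9.75% figure follows from treating the two 5% losses as independent: a packet gets through only if it is dropped neither on transmission nor on reception. A minimal Python sketch of that arithmetic (the function name is purely illustrative and is not part of the pinger code):

```python
def overall_loss(tx_loss, rx_loss):
    """Probability that a packet is lost, assuming independent send and receive loss."""
    survive = (1.0 - tx_loss) * (1.0 - rx_loss)  # must survive both stages
    return 1.0 - survive


# Overall loss percentage for 5% transmission loss and 5% reception loss.
print(round(overall_loss(0.05, 0.05) * 100, 2))
```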
350 
351 This test is now automatically run as part of the testify tests, so see the testify section above.
352 
353 Because it has a lot of debug enabled, debug output may or may not appear by default
354 depending on which version of glib2 you have installed.
355 Nevertheless, at the end you should be able to see these messages:
356 <pre>
357 Received a PING packet (seq 7) from [::1]:19840 ========================
358 Sending a PONG(2)/PING set to [::1]:19840
359 Received a PONG packet from [::1]:19840
360 Received a PONG packet from [::1]:19840
361 Received a PING packet (seq 8) from [::1]:19840 ========================
362 Sending a PONG(2)/PING set to [::1]:19840
363 Received a PONG packet from [::1]:19840
364 ** Message: _netio_recvapacket: Threw away 66 byte input packet
365 ** Message: _netio_recvapacket: Threw away 74 byte input packet
366 Received a PONG packet from [::1]:19840
367 Received a PING packet (seq 9) from [::1]:19840 ========================
368 Sending a PONG(2)/PING set to [::1]:19840
369 Received a PONG packet from [::1]:19840
370 Received a PONG packet from [::1]:19840
371 ** Message: Shutting down on ping count.
372 Received a PING packet (seq 10) from [::1]:19840 ========================
373 Sending a PONG(2)/PING set to [::1]:19840
374 Received a PONG packet from [::1]:19840
375 Received a PONG packet from [::1]:19840
376 ALL CONNECTIONS SHUT DOWN! calling g_main_quit()
377 ** Message: No objects left alive. Awesome!
378 </pre>
379 
380 Because the packet loss is random, the various <i>Threw away...</i> messages
381 will likely be in different places.
382 But it <i>should</i> stop, and it should end with the <i>Awesome!</i> message.
383 
384 
385 @section ConfiguringTheServices Configuring the Services
386 There is currently no configuration needed for these systems
387 under most circumstances. If your network does not support multicast,
388 then you will have to invoke the nanoprobes with an argument
389 specifying the address of the CMA.
390 By default communication takes place on UDP port 1984.
391 If port 1984 is not available to a nanoprobe, it will bind to an ephemeral port instead.
392 
393 This happens every time a nanoprobe is started on the CMA machine,
394 since the CMA has already bound to that port.
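
The fallback behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the nanoprobe's actual (C) implementation; the function name and the use of the loopback address are assumptions made for the example.

```python
import socket


def bind_udp_with_fallback(preferred_port, host="127.0.0.1"):
    """Bind a UDP socket to preferred_port, or to an ephemeral port if it is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, preferred_port))
    except OSError:
        # The port is already in use (for example by the CMA):
        # port 0 asks the operating system to pick an ephemeral port.
        sock.bind((host, 0))
    return sock
```

Calling <tt>bind_udp_with_fallback(1984)</tt> twice in the same process mimics the CMA machine: the first socket normally gets port 1984 and the second ends up on an ephemeral port, which <tt>getsockname()</tt> will report.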
395 
396 @section DealingWithFirewalls Dealing With Firewalls
397 Some systems (RHEL for example) come configured out of the box
398 with a default iptables configuration which will block our communication.
399 As you might guess, things don't work too well under those circumstances.
400 
401 To write firewall (iptables) rules to allow our communication, it is necessary to understand how
402 the Assimilation code communicates.
403 All our communication uses the UDP protocol.
404 The CMA and all the nanoprobes <i>except the one running on the CMA machine</i> default to
405 using UDP port 1984. However, the nanoprobe and the CMA cannot both use
406 port 1984 at the same time, so if the nanoprobe can't use its requested port,
407 it will use an ephemeral port.
408 As long as there is only one system using an ephemeral port, then all communication
409 will have either a source or a destination port of 1984.
410 
411 For this (normal) case, the following firewall rules will allow our software to communicate.
412 <pre>
413 -A INPUT -m udp -p udp --dport 1984 -j ACCEPT
414 -A INPUT -m udp -p udp --sport 1984 -j ACCEPT
415 </pre>
416 Non-CMA machines should only need the first rule.
417 The CMA will need both rules added to its rule set if iptables would otherwise block the communication.
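
To see why the CMA machine needs both rules, it can help to model the two rules as a simple predicate on the source and destination ports of an incoming packet. The sketch below is a toy model for reasoning only (it does not talk to iptables); the function name and the <tt>rules</tt> parameter are invented for the example, and port 45714 is just the ephemeral port that appears in the sample logs in this document.

```python
CMA_PORT = 1984


def accepted_by_rules(sport, dport, rules=("dport", "sport")):
    """Return True if an incoming UDP packet matches one of the modelled rules.

    "dport" models the --dport 1984 rule, "sport" models the --sport 1984 rule.
    """
    ports = {"sport": sport, "dport": dport}
    return any(ports[field] == CMA_PORT for field in rules)


# A non-CMA machine with only the first rule: a packet from the nanoprobe on
# the CMA machine (ephemeral source port) still arrives on port 1984.
print(accepted_by_rules(45714, 1984, rules=("dport",)))

# The CMA machine: a reply from a remote nanoprobe (source port 1984) to the
# local nanoprobe's ephemeral port only matches the second rule.
print(accepted_by_rules(1984, 45714, rules=("dport",)))
print(accepted_by_rules(1984, 45714))
```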
418 
419 @section ActivatingTheServices Activating The Services
420 As of this writing, the packages we install do not activate the services,
421 so you will need to activate them manually. Sorry :-(
422 
423 @subsection StartingNeo4j Starting the Neo4j Database
424 - <tt>service neo4j-service start</tt>
425 
426 @subsection StartingAssimilationCode Starting the Assimilation Code
427 Keep in mind that you need to install and start nanoprobes on every machine,
428 but you should only start the <i>cma</i> service on one machine.
429 
430 On Debian-based systems:
431 - <tt>/usr/sbin/update-rc.d nanoprobe defaults</tt>
432 - <tt>/usr/sbin/update-rc.d cma defaults</tt>
433 - <tt>service cma start</tt>
434 - <tt>service nanoprobe start</tt>
435 
436 On SuSE systems:
437 - <tt>insserv nanoprobe</tt>
438 - <tt>insserv cma</tt>
439 - <tt>service cma start</tt>
440 - <tt>service nanoprobe start</tt>
441 
442 On RedHat systems:
443 - <tt>chkconfig --add nanoprobe</tt>
444 - <tt>chkconfig --add cma</tt>
445 - <tt>service cma start</tt>
446 - <tt>service nanoprobe start</tt>
447 
448 On LSB-compliant systems:
449 - <tt>/usr/lib/lsb/install_initd nanoprobe</tt>
450 - <tt>/usr/lib/lsb/install_initd cma</tt>
451 - <tt>service cma start</tt>
452 - <tt>service nanoprobe start</tt>
453 
454 If for some reason you need to reinitialize the database while experimenting, start the CMA
455 with the <tt>--erasedb</tt> flag the next time you start it.
456 
457 @section ReadingTheLogs Reading System Logs
458 The nanoprobe code and the CMA code operate as normal daemons.
459 That is, they put themselves in the background and everything worth knowing is put into
460 the system logs.
461 
462 Below are a few interesting messages which you can expect to see,
463 along with explanations of what they mean.
464 
465 @subsection CMAStartUpMessages CMA startup messages
466 <pre>
467 Mar 3 14:20:45 servidor cma INFO: Listening on: 0.0.0.0:1984
468 Mar 3 14:20:45 servidor cma INFO: Requesting return packets sent to: 10.10.10.5:1984
469 Mar 3 14:20:45 servidor cma INFO: Starting CMA version 0.1.0 - licensed under The GNU General Public License Version 3
470 </pre>
471 
472 The CMA has started up, is listening on the ANY address (0.0.0.0), port 1984, and is telling nanoprobes to send their
473 packets to address 10.10.10.5, port 1984. Currently messages printed from the CMA by the C code have the process id
474 in them, and messages from the Python code do not.
475 
476 @subsection NanoprobeStartUpMessages Nanoprobe Startup Messages
477 <pre>
478 Mar 3 14:23:14 servidor nanoprobe[17660]: INFO: CMA address: 224.0.2.5:1984
479 Mar 3 14:23:14 servidor nanoprobe[17660]: INFO: Local address: [::]:45714
480 Mar 3 14:23:14 servidor nanoprobe[17660]: INFO: Starting version 0.1.0: licensed under The GNU General Public License Version 3
481 Mar 3 14:23:17 servidor cma INFO: Drone servidor registered from address [\::ffff:10.10.10.5]:45714 (10.10.10.5:45714)
482 Mar 3 14:23:17 servidor nanoprobe[17660]: NOTICE: Connected to CMA. Happiness :-D
483 Mar 3 14:23:19 servidor cma INFO: Stored arpcache JSON data from servidor without processing.
484 Mar 3 14:23:20 servidor cma INFO: Stored cpu JSON data from servidor without processing.
485 Mar 3 14:23:21 servidor cma INFO: Stored OS JSON data from servidor without processing.
486 </pre>
487 This means that a nanoprobe has started up with process id 17660, and will try to locate the CMA by sending a packet
488 to the (multicast) address 224.0.2.5, port 1984.
489 It is listening for packets sent to ANY address, port 45714.
490 The port used is 45714 instead of 1984 because this nanoprobe is on the same machine as the CMA,
491 which already bound to [::]:1984. Nanoprobes on other machines will normally show
492 <tt>Local address: [::]:1984</tt> instead.
493 The "Stored ... JSON data from ... without processing" messages mean that we received new (different)
494 information for this discovery module than we had in the database, and that we
495 just stored it.
496 These particular discovery items have no special actions taken when they arrive - they're just stored
497 in the database.
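
If you want to pick these messages out of the system logs with a script, a small parser along the following lines may be a useful starting point. It is a sketch that assumes the log lines look exactly like the samples shown in this document (your syslog configuration may format them differently); the regular expression and function name are invented for the example. The process id is optional because messages from the CMA's Python code do not include one.

```python
import re

# Timestamp, host, program name, optional [pid], level, then the message text.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) "
    r"(?P<host>\S+) "
    r"(?P<program>[\w-]+)(?:\[(?P<pid>\d+)\])?:? "
    r"(?P<level>[A-Z]+): "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of the fields in one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


sample = "Mar 3 14:23:14 servidor nanoprobe[17660]: INFO: CMA address: 224.0.2.5:1984"
print(parse_log_line(sample))
```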
498 
499 @subsection NanoprobeShutdownMessages Nanoprobe Shutdown Messages
500 The messages below were the result of a <tt>service nanoprobe stop</tt> command.
501 <pre>
502 Mar 3 14:30:55 servidor nanoprobe[18879]: NOTICE: nanoprobe: exiting on SIGTERM.
503 Mar 3 14:30:55 servidor cma INFO: System servidor at [\::ffff:10.10.10.5]:45714 reports it has been gracefully shut down.
504 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of heartbeats: 0
505 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of deadtimes: 0
506 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of warntimes: 0
507 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of comealives: 0
508 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of martians: 0
509 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of LLDP/CDP pkts sent: 1
510 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of LLDP/CDP pkts received: 27
511 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of recvfrom calls: 28
512 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of pkts read: 13
513 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of framesets read: 13
514 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of sendto calls: 14
515 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of pkts written: 14
516 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of framesets written: 0
517 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of reliable framesets sent: 10
518 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of reliable framesets recvd: 2
519 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of ACKs sent: 3
520 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of ACKs recvd: 10
521 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: Count of 'other' pkts received: 0
522 Mar 3 14:30:55 servidor nanoprobe[18879]: INFO: No objects left alive. Awesome!
523 </pre>
524 
525 The nanoprobe announced it was exiting, the CMA acknowledged that the system was shutting down
526 gracefully, the nanoprobe then printed out various statistics, and finally ended
527 with the <i>Awesome!</i> message indicating no memory leaks were observed.
528 
529 @subsection NanoprobeCrashMessages Nanoprobe Crash Messages
530 The messages below occurred when the system running a nanoprobe or the nanoprobe itself crashed.
531 In this case, the nanoprobe process running on paul was killed with SIGKILL (kill -9) which simulates a server crash.
532 <pre>
533 Mar 11 12:27:27 servidor nanoprobe[6416]: WARN: Peer at address [\::ffff:10.10.10.16]:1984 is dead (has timed out).
534 Mar 11 12:27:27 servidor cma WARNING: DispatchHBDEAD: received [HBDEAD] FrameSet from [[\::ffff:10.10.10.5]:44782]
535 Mar 11 12:27:27 servidor cma INFO: Node paul has been reported as dead by address [\::ffff:10.10.10.5]:44782. Reason: HBDEAD packet received
536 Mar 11 12:27:28 servidor cma WARNING: DispatchHBDEAD: received [HBDEAD] FrameSet from [[\::ffff:10.10.10.2]:1984]
537 Mar 11 12:27:28 servidor cma INFO: Node paul has been reported as dead by address [\::ffff:10.10.10.2]:1984. Reason: HBDEAD packet received
538 </pre>
539 
540 Note that the dead node (paul) was reported as being dead by the two peers monitoring it.
541 Since one of paul's peers was the node running the CMA (servidor), the first message "<tt>Peer at address...is dead</tt>"
542 from the nanoprobe also appears in the logs on this machine.
543 
544 
545 @subsection CMACrashMessages CMA Crash Messages
546 If you should see the CMA misbehave, it will probably either disappear with a crash
547 (indicating a problem in the interfaces to the C code), or it will catch an
548 exception handling a message from a client.
549 The messages below are typical of what you would see should this unfortunate event occur:
550 <pre>
551 Mar 3 14:50:08 servidor cma CRITICAL: MessageDispatcher exception [Relationship direction must be an integer value] occurred while handling [HBDEAD] FrameSet from [\::ffff:10.10.10.2]:1984
552 Mar 3 14:50:08 servidor cma INFO: FrameSet Contents follows (1 lines):
553 Mar 3 14:50:08 servidor cma INFO: HBDEAD:{SIG: {SignFrame object at 0x0x1e56680}, pySeqNo(REQID: (0, 1)), IPPORT: IpPortFrame(13, [\::ffff:10.10.10.16]:1984), END: {Frame object at 0x0x1edd890}}
554 Mar 3 14:50:08 servidor cma INFO: ======== Begin HBDEAD Message Relationship direction must be an integer value Exception Traceback ========
555 Mar 3 14:50:08 servidor cma INFO: messagedispatcher.py.51:dispatch: self.dispatchtable[fstype].dispatch(origaddr, frameset)
556 Mar 3 14:50:08 servidor cma INFO: dispatchtarget.py.61:dispatch: deaddrone.death_report('dead', 'HBDEAD packet received', origaddr, frameset)
557 Mar 3 14:50:08 servidor cma INFO: droneinfo.py.269:death_report: hbring.HbRing.ringnames[ringname].leave(self)
558 Mar 3 14:50:08 servidor cma INFO: hbring.py.178:leave: relationships = drone.node.get_relationships('all', self.ournexttype)
559 Mar 3 14:50:08 servidor cma INFO: neo4j.py.1190:get_relationships: uri = self._typed_relationships_uri(direction, types)
560 Mar 3 14:50:08 servidor cma INFO: neo4j.py.1161:_typed_relationships_uri: raise ValueError("Relationship direction must be an integer value")
561 Mar 3 14:50:08 servidor cma INFO: ======== End HBDEAD Message Relationship direction must be an integer value Exception Traceback ========
562 </pre>
563 
564 This particular set of messages was caused by a mismatch between the CMA code and the version of the <i>py2neo</i> code.
565 Note the <b>CRITICAL: MessageDispatcher exception</b> message that started it all off.
566 The next line contains a dump of the message that triggered the failure, followed by
567 a stack trace formatted to be passably readable in syslog.
568 
569 
570 @section EnablingDebugging Enabling Debugging
571 Both the CMA and the nanoprobe process take a <b>-d</b> flag to increment the debug level
572 by one. Currently debug values between 1 and 5 produce increasing levels of detail.
573 In addition, while the nanoprobe code is running, its debug level can be modified
574 at run time by sending it signals. If you send it a <b>SIGUSR1</b> signal
575 the overall debug level will be raised by one. If you send it a <b>SIGUSR2</b> signal,
576 the overall debug level will be lowered by one - unless it is already at zero, in which
577 case the <b>SIGUSR2</b> will be ignored.
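
The run-time rules above amount to a small state change per signal. The following Python sketch models them and shows how a signal could be sent from a script; it is an illustration under the rules stated in this section, not the nanoprobe's own code, and both function names are hypothetical.

```python
import os
import signal


def adjust_debug_level(level, signum):
    """Return the debug level that results from receiving signum at the given level."""
    if signum == signal.SIGUSR1:
        return level + 1
    if signum == signal.SIGUSR2:
        # Already at zero: the signal is ignored and the level stays at zero.
        return max(level - 1, 0)
    return level


def raise_debug(pid):
    """Ask the process with the given pid to raise its debug level by one."""
    os.kill(pid, signal.SIGUSR1)
```

From a shell, the equivalent of <tt>raise_debug</tt> is the standard <tt>kill -USR1</tt> command applied to the nanoprobe's process id.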
578 
579 @section ExaminingNeo4j Examining the Neo4j database
580 Neo4j comes with an Administrative web server for examining various aspects of the database.
581 It can be reached at <a href="http://localhost:7474/webadmin/">http://localhost:7474/webadmin/</a>.
582 
583 The tabs you should find there include:
584 
585  - <b>Overview Dashboard </b> - provides an overview of the number of nodes, relationships and properties over time
586  - <b>Explore and edit</b> - Visual Data browser - visually display the result of a <a href="http://www.neo4j.org/learn/cypher">Cypher</a> query. It is also interactive, allowing you to arrange nodes on the screen.
587  - <b>Power tool Console</b> - A low-level shell-like language for exploring the database.
588  Can also be invoked as <tt>neo4j-shell</tt> from the command line.
589  - <b>Add and remove indexes</b> - you probably don't want to do this
590  - <b>Server Info</b> - information about how this Neo4j server is configured
591 @section CoolCypherQueries A few Cool Cypher queries
592 Below you'll find a number of useful and interesting Cypher queries which you
593 can issue from the Neo4j Administrative web server mentioned above, or you can
594 embed them in your programs.
595 The list below is far from exhaustive, but should be sufficient to give a few ideas
596 of the types of things that can be done.
597 
598 In order to fully appreciate the kinds of queries that one might perform, it is
599 necessary to understand the Assimilation Project's Neo4j schema.
600 This schema was outlined in a number of blog postings - relating to the overall
601 <a href="http://techthoughts.typepad.com/managing_computers/2012/08/an-assimilation-type-schema-in-neo4j.html">nodetype schema</a>,
602 <a href="http://techthoughts.typepad.com/managing_computers/2012/07/neo4j-server-schema-for-the-assimilation-project.html">Servers and IP addresses</a>,
603 <a href="http://techthoughts.typepad.com/managing_computers/2012/07/assimilation-ring-neo4j-schema.html">rings</a>,
604 <a href="http://techthoughts.typepad.com/managing_computers/2012/07/discovering-switches-its-amazing-what-you-can-learn-just-by-listening.html">switches and switch connections</a>,
605 and lastly
606 <a href="http://techthoughts.typepad.com/managing_computers/2012/07/clients-servers-and-dependencies-oh-my.html">clients, servers and dependencies</a>.
607 
608 
609 
610 @subsection GetTheServerList Retrieve The List of Servers
611 <pre>
612 START root=node(0)
613 MATCH drone-[:IS_A]->type-[:IS_A]->root
614 WHERE type.name = "Drone"
615 RETURN drone
616 </pre>
617 This will bring up a table with the nodes in the graph for servers (Drones) in the database.
618 If you click on any of the items in the graph, it will show all the basic properties for a server.
619 This list should include these items:
620  - <tt>port</tt>: the port the nanoprobe is listening on
621  - <tt>nodetype</tt>: "Drone"
622  - <tt>status</tt>: "up" or "down"
623  - <tt>reason</tt>: the reason for the last status update
624  - <tt>name</tt>: hostname
625  - <tt>iso8601</tt>: time of last status update in ISO 8601 format - <i>will probably go away in the future</i>
626  - <tt>statustime</tt>: time of last status update - milliseconds since 00:00 Jan 1, 1970 UTC (the UNIX epoch)
627  - <tt>JSON_arpcache</tt>: JSON from ARP cache discovery
628  - <tt>JSON_cpu</tt>: JSON from cpu discovery
629  - <tt>JSON_netconfig</tt>: JSON from network configuration discovery
630  - <tt>JSON_OS</tt>: JSON from OS discovery
631  - <tt>JSON_tcpclients</tt>: JSON from the tcpclients discovery
632  - <tt>JSON_tcplisteners</tt>: JSON from the tcplisteners discovery
633  - <tt>JSON_\#LinkDiscovery</tt>: JSON from link discovery (if you have an LLDP-equipped switch)
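
Since <tt>statustime</tt> is a count of milliseconds since the UNIX epoch, it is easy to turn into a readable timestamp in your own programs. A minimal Python sketch, assuming the value is measured from the epoch in UTC as usual (the function name is illustrative, and the output format is Python's own, which may not match the stored <tt>iso8601</tt> property character for character):

```python
from datetime import datetime, timezone


def statustime_to_iso8601(statustime_ms):
    """Convert milliseconds since the UNIX epoch to an ISO 8601 string in UTC."""
    moment = datetime.fromtimestamp(statustime_ms / 1000.0, tz=timezone.utc)
    return moment.isoformat()


print(statustime_to_iso8601(0))  # 1970-01-01T00:00:00+00:00
```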
634 
635 If you just want the list of host names, you can use this very similar query:
636 <pre>
637 START typeroot=node(0)
638 MATCH drone-[:IS_A]->nodetype-[:IS_A]->typeroot
639 WHERE nodetype.name = "Drone"
640 RETURN drone.name
641 </pre>
642 
643 @subsection GetDownServers Retrieve The List of Down Servers
644 The query below returns the set of servers which are currently marked down - regardless of the reason
645 they're down.
646 <pre>
647 START typeroot=node(0)
648 MATCH drone-[:IS_A]->nodetype-[:IS_A]->typeroot
649 WHERE nodetype.name = "Drone" and drone.status = "dead"
650 RETURN drone
651 </pre>
652 
653 @subsection GetShutDownServers Retrieve The List of Gracefully Shut Down Servers
654 The query below returns the set of servers which are currently down and were shut down gracefully.
655 <pre>
656 START typeroot=node(0)
657 MATCH drone-[:IS_A]->nodetype-[:IS_A]->typeroot
658 WHERE nodetype.name = "Drone" and drone.status = "dead" and drone.reason = "HBSHUTDOWN"
659 RETURN drone
660 </pre>
661 
662 @subsection GetCrashedServers Retrieve The List of Crashed Servers
663 The query below returns the set of servers which are down but were <b>not</b> shut down gracefully
664 (i.e., they crashed).
665 <pre>
666 START typeroot=node(0)
667 MATCH drone-[:IS_A]->nodetype-[:IS_A]->typeroot
668 WHERE nodetype.name = "Drone" and drone.status = "dead" and drone.reason <> "HBSHUTDOWN"
669 RETURN drone
670 </pre>
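
The three "down server" queries above differ only in their WHERE clause, so if you embed them in a program you may prefer to generate them. The Python sketch below only builds the query text; how you submit it to Neo4j depends on the client library you use and is not shown. The function and parameter names are invented for this example, and the values are pasted into the query unescaped, so it should only be used with trusted constants.

```python
def drone_query(status=None, reason=None, not_reason=None, returns="drone"):
    """Build one of the Drone queries shown above as a Cypher string."""
    conditions = ['nodetype.name = "Drone"']
    if status is not None:
        conditions.append('drone.status = "%s"' % status)
    if reason is not None:
        conditions.append('drone.reason = "%s"' % reason)
    if not_reason is not None:
        conditions.append('drone.reason <> "%s"' % not_reason)
    return "\n".join([
        "START typeroot=node(0)",
        "MATCH drone-[:IS_A]->nodetype-[:IS_A]->typeroot",
        "WHERE " + " and ".join(conditions),
        "RETURN " + returns,
    ])


# The "crashed servers" query: down, but not gracefully shut down.
print(drone_query(status="dead", not_reason="HBSHUTDOWN"))
```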
671 
672 @subsection GetCrashedServersWithTimes Retrieve The List of Crashed Servers and When They Crashed
673 The query below returns the set of servers which are down but were <b>not</b> shut down gracefully
674 (i.e., they crashed) - with the systems that have been down the longest first.
675 <pre>
676 START typeroot=node(0)
677 MATCH drone-[:IS_A]->nodetype-[:IS_A]->typeroot
678 WHERE nodetype.name = "Drone" and drone.status = "dead" and drone.reason <> "HBSHUTDOWN"
679 RETURN drone, drone.iso8601
680 ORDER BY drone.iso8601
681 </pre>
682 
683 @subsection GetNICConnections Return which server NICs are connected to which switch NICs
684 The query below will return which switch ports are connected to which server ports, along with the
685 SystemName of the switch, and the description of the switch port.
686 As of the current release, this query will only produce results if you have LLDP data available to your servers.
687 <pre>
688 START typeroot=node(0)
689 MATCH switch<-[:nicowner]-switchnic-[:wiredto]-dronenic-[:nicowner]->drone-[:IS_A]->nodetype-[:IS_A]->typeroot
690 WHERE nodetype.name = "Drone"
691 RETURN drone.name, dronenic.nicname, switch.SystemName, switchnic.nicname, switchnic.PortDescription
692 </pre>
693 This should produce output which looks something like this:
694 <pre>
695 <b>drone.name dronenic.nicname switch.SystemName switchnic.nicname switchnic.PortDescription</b>
696 servidor eth0 GS724T_10_10_10_250 g6 Alan's office - North wall, grey jack
697 </pre>
698 
699 @subsection GetRingMembers Return which servers are members of a given ring
700 The query below will return all the systems which are in the given ring (in this case "The_One_Ring").
701 For the current state of only one ring, this is another way to get the list of all up servers.
702 <pre>
703 START Ring=node:Ring(Ring="The_One_Ring")
704 MATCH Ring<-[:RingMember_The_One_Ring]-Drone
705 RETURN Drone
706 </pre>
707 
708 @subsection GetOrderedRingMembers Return servers on a ring with a node, in the order they appear on the ring
709 The query below will follow the <i>RingNext</i> links from the given node around the ring
710 until they return to the initial node. It's kind of a funky little query...
711 <pre>
712 START Drone=node:Drone(Drone="drone000001")
713 MATCH Drone-[:RingNext_The_One_Ring*]->NextDrone
714 RETURN NextDrone.name, NextDrone
715 </pre>
716 The results should look something like this:
717 <pre>
718 "drone000002" [Node 31258]
719 "drone000003" [Node 31261]
720 "drone000004" [Node 31264]
721 "drone000005" [Node 31267]
722 "drone000001" [Node 31255]
723 </pre>
724 Note that this query returns the initial node <i>last</i>.
725 Also note that the Neo4j people claim you shouldn't rely on the results being
726 returned in the order you want them to be. But this does seem to work...
727 
728 @subsection EvenMoreQueries Even More Cool Cypher Queries
729 These queries don't begin to scratch the surface of what you can do with the Assimilation
730 Monitoring project and Cypher queries into the Neo4j database.
731 So, now it's up to you!
732 
733 Go forth, create even more Cool Cypher queries, and share them with everyone on the Assimilation
734 <a href="http://lists.community.tummy.com/cgi-bin/mailman/listinfo/assimilation">mailing list</a>.
735 
736 The CMA code has a collection of canned queries. You can read
737 these queries along with some metadata about them by looking at
738 the source files you find
739 <a href="http://hg.linux-ha.org/assimilation/file/tip/queries">here</a>.
740 
741 @section UnInstalling Un-installing
742 If you wish to uninstall the software, and you installed it as packages, please use the mechanism
743 that comes with the packaging for your operating system.
744 
745 If you installed it with <tt>sudo make install</tt>, then there should be a file named
746 <tt>install_manifest.txt</tt> in the top directory of your build directory that lists all the
747 files that were installed. Removing the files listed in that file should remove all the
748 installed files.
749 
750 @section GettingStartedConclusion Conclusion
751 If you have executed all these steps, and everything has worked, then congratulations, everything is working!
752 Please let the <a href="http://lists.community.tummy.com/cgi-bin/mailman/listinfo/assimilation">mailing list</a> know!
753 
754 If it didn't work for you, it's <i>even more</i> important to let the mailing list know.
755 
756 */