I asked this question on StackOverflow about a week ago but have had no answer yet. It may not be possible, but I don't know where else to look, so I hope someone here can help.

I'm using monit to scan logs for errors and then push those alerts into a monitoring system called DataDog. Everything works as expected, but now I need to capture what is causing the alarm.

With a very simple rule I can catch the log line that triggers the error and run a script to send the alert. Up to here everything is fine.

monitrc file:

    check file testmonit with path /var/log/testmonit.log
    exec "/usr/bin/python /opt/scripts/bin/dd_notify.py test-error"

This config does what I want: it raises the alarm. But now I need to know what caused the alarm. For example, if this line appears in the log:

    ERROR failure to complete process due lock file.

In the monit logs I can see:

    error : 'testmonit' content match
    info : 'testmonit' exec: /usr/bin/python

I want to send this string to my monitoring system (DataDog), but I can't find any documentation that lets me use the MATCH content, or groups (which I can see are supported by the MATCH regex).

Is there any monit variable (like $DESCRIPTION for mail) that refers to the matched line that triggered the rule? (I've tried $DESCRIPTION, $HOST, etc., but these seem to work only for email.)

I've searched Google several times (and also here) but I can't find the answer. If you think this has been addressed before, please feel free to point me in the right direction.

Sorry, I forgot to say that I'm running this on Ubuntu 16.04 LTS, and the monit version output is:

    Built with ssl, with ipv6, with compression, with pam and with large files
    Copyright (C) 2001-2017 Tildeslash Ltd. All Rights Reserved.

Following DevOps's recommendation, I've upgraded monit and tried to use MONIT_DESCRIPTION or $MONIT_DESCRIPTION, without success. The rule file looks like this:

    check file pd-error with path /var/log/testmonit.log
    If CONTENT = "ERROR" then exec "/usr/bin/python /opt/scripts/bin/dd_notify.py pd_error " $MONIT_DESCRIPTION

But what I get is only the result of executing dd_notify.py.

What I want is to pass the content that was matched as an additional argument to the dd_notify.py program. That is, I want the content of $MONIT_DESCRIPTION, which is actually:

    debug : 'pd-error' Pattern 'ERROR' match on content line : celery.worker.job ERROR Task _and_send_telemetry_info fail]

But I'm not having any luck at the moment.

Thanks to DevOps for putting me on the right track to finish with this issue. I've finally succeeded in what I wanted to do, and I can also explain (from my understanding) why it wasn't working for me before.

What I was doing:

    check file pd-error with path /var/log/testmonit.log

The thing is, as DevOps says, there is a variable MONIT_DESCRIPTION that does in fact hold the error string, but this variable is only "reachable" from the bash environment.

I am trying to constantly monitor a process which is basically a Python program. If the program stops, I have to start it again. I am using another Python program to do so.

For example, say I have to constantly run a process called run_constantly.py. I initially run this program manually, and it writes its process ID to the file "PID" (in the location out/PROCESSID/PID).

Now I run another program, which has the following code to monitor run_constantly.py from a Linux environment:

    def Monitor_Periodic_Process():
        foo = imp.load_source("Run_Module", "run_constantly.py")
        while True:
            # call the function checkPID to see if the program is running or not
            res = checkPID()
            # if res is 0 then program is not running so schedule it
            if res == 0:
                scheduler.add_cron_job(foo.Run_Module, year=date_time.year, day=date_time.day, month=date_time.month, hour=date_time.hour, minute=date_time.minute+2)
            else:
                # the process is running: sleep and then monitor again

I have not included the checkPID() function here. checkPID() basically checks whether the process ID still exists (i.e. whether the program is still running), and if it does not exist, it returns 0. In the above program, I check whether res is 0, and if so, I use Python's scheduler to schedule the program.

However, the major problem I am currently facing is that the process ID of this program and that of run_constantly.py turn out to be the same once I schedule run_constantly.py with scheduler.add_cron_job(). So if run_constantly.py crashes, the monitoring program still thinks it is running (since both process IDs are the same), and therefore keeps going into the else branch to sleep and monitor again.
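For the Python monitoring question, here is a minimal sketch of what a checkPID()-style check could look like. It assumes a POSIX system and the PID file path from the question. The names read_pid, check_pid and start_in_new_process are hypothetical, not the asker's code. As far as I can tell, the PIDs match because the scheduler calls foo.Run_Module inside the monitoring process itself, so the sketch starts the script as a child process instead, which gives it its own PID.

```python
import os
import subprocess
import sys

PID_FILE = "out/PROCESSID/PID"  # path taken from the question


def read_pid(pid_file=PID_FILE):
    """Return the PID stored in pid_file, or None if it is missing or invalid."""
    try:
        with open(pid_file) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def check_pid(pid):
    """Return 1 if a process with this PID exists, else 0 (POSIX only)."""
    if pid is None or pid <= 0:
        return 0
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks existence
    except ProcessLookupError:
        return 0
    except PermissionError:
        return 1  # the process exists but belongs to another user
    return 1


def start_in_new_process(script="run_constantly.py"):
    """Start the script as a child process so it gets its own PID."""
    return subprocess.Popen([sys.executable, script])
```

If the monitor itself started the child with start_in_new_process(), calling poll() on the returned Popen object is probably the simpler liveness check, since a child that has exited but has not been waited on still shows up as existing to os.kill(pid, 0).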
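For the monit question, the conclusion was that MONIT_DESCRIPTION is only reachable from the environment of the executed program, not from inside the rule. One way to use that is to have the script read the variable itself rather than receive it as an argument. This is a minimal sketch of how a script like dd_notify.py might do that. build_message and the "unknown" fallback are hypothetical, and the actual DataDog call is left out.

```python
import os
import sys


def build_message(argv, environ):
    """Combine the alert name from argv with monit's description, if set."""
    alert = argv[1] if len(argv) > 1 else "unknown"
    # monit exports MONIT_DESCRIPTION into the environment of the exec'd program
    description = environ.get("MONIT_DESCRIPTION", "")
    return alert, description


if __name__ == "__main__":
    alert, description = build_message(sys.argv, os.environ)
    # Hypothetical: pass alert and description to whatever sends the DataDog event.
    print("%s: %s" % (alert, description))
```

With this approach the rule would only need to exec the script with the alert name (for example pd_error), without appending $MONIT_DESCRIPTION to the command.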