Documentation
Index ¶
- func HTTPHealthCheck(healthCheckEndpointURL string, config *HTTPHealthCheckConfig) func(ctx context.Context) Report
- func StateMessage(s Status) string
- type Check
- type DependencyCheck
- type DetailCheck
- type HTTPHealthCheckConfig
- type Issue
- type IssueJSONDTO
- type Monitor
- type Report
- type ReportJSONDTO
- type Status
Functions ¶
func HTTPHealthCheck ¶
func HTTPHealthCheck(healthCheckEndpointURL string, config *HTTPHealthCheckConfig) func(ctx context.Context) Report
Example ¶
package main

import (
	"go.llib.dev/frameless/pkg/devops/health"
)

func main() {
	var m = health.Monitor{
		Dependencies: []health.DependencyCheck{
			health.HTTPHealthCheck("https://www.example.com/health", nil),
		},
	}
	_ = m
}
func StateMessage ¶
Types ¶
type Check ¶ added in v0.203.0
Check represents a health check. Check is supposed to return nil if the check passes, and an error if the check detects a problem. To describe the problem in detail, Check may return an Issue. As documented on Monitor.Checks, a returned generic error is considered an Issue with Down Status.
type DependencyCheck ¶ added in v0.203.0
DependencyCheck serves as a health check for a specific dependency. If an error occurs during the check, it should be represented as an Issue in the returned Report.Issues list.
For example, if a remote service is unreachable on the network, Report.Issues should contain an Issue stating that the service is unreachable, and its Issue.Causes should indicate that the dependency's health Status is considered Down.
type DetailCheck ¶ added in v0.218.0
DetailCheck represents a metric reporting function. The result will be added to Report.Details. DetailCheck results serve analytical purposes and act as status indicators for the service at the time the check was called. If numerical values are included, they should fluctuate over time, reflecting the current state.
Values that behave differently depending on how long the application has been running are not ideal. For instance, a good metric value indicates the current throughput of the HTTP API.
A challenging metric value would be a counter of the total number of requests handled over the lifetime of a given application instance.
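The difference between the two kinds of values can be sketched with the standard library alone. The sample type and the ratePerSecond helper below are hypothetical names made up for this illustration; they are not part of the health package. The sketch assumes the application already keeps a monotonically increasing request counter and takes periodic snapshots of it.

```go
package main

import (
	"fmt"
	"time"
)

// sample is a hypothetical snapshot of a monotonically increasing request counter.
type sample struct {
	at    time.Time
	total int64
}

// ratePerSecond derives the request throughput between two snapshots.
// It returns 0 when the window is empty or the samples are out of order.
func ratePerSecond(prev, curr sample) float64 {
	elapsed := curr.at.Sub(prev.at).Seconds()
	if elapsed <= 0 {
		return 0
	}
	return float64(curr.total-prev.total) / elapsed
}

func main() {
	start := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	prev := sample{at: start, total: 1_000}
	curr := sample{at: start.Add(10 * time.Second), total: 1_250}

	// The lifetime total keeps growing with uptime,
	// while the rate describes only the last 10 seconds.
	fmt.Println("lifetime total:", curr.total)
	fmt.Println("requests/sec:", ratePerSecond(prev, curr))
}
```

A value such as the computed rate could then be what a detail reports, since it reflects the current state rather than the instance's age.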
type HTTPHealthCheckConfig ¶
type Issue ¶
type Issue struct {
	// Code is meant for programmatic processing of an issue detection.
	// Should contain no whitespace and use dash-case/snakecase/const-case.
	Code consttypes.String
	// Message can contain further details about the detected issue.
	Message string
	// Causes will indicate the status change this Issue will cause
	Causes Status
}

Issue represents an issue detected during a health check.
type IssueJSONDTO ¶ added in v0.202.1
type Monitor ¶
type Monitor struct {
	// ServiceName will be used to set the Report.Name field.
	ServiceName string
	// Checks contain the health checks about our own service.
	// Check should return with nil in case the check passed.
	// Check should return back with an Issue or a generic error, in case the check failed.
	// Returned generic errors are considered as an Issue with Down Status.
	Checks []Check
	// Dependencies represent our service's dependencies and their health state (Report).
	// DependencyCheck should come back always with a valid Report.
	Dependencies []DependencyCheck
	// Details represents our service's monitoring metrics.
	Details map[string]DetailCheck
}
Example (Check) ¶
package main

import (
	"context"
	"sync"

	"go.llib.dev/frameless/pkg/devops/health"
)

func main() {
	const detailKeyForHTTPRetryPerSec = "http-retry-average-per-second"
	appDetails := sync.Map{}

	var hm = health.Monitor{
		Checks: []health.Check{
			func(ctx context.Context) error {
				value, ok := appDetails.Load(detailKeyForHTTPRetryPerSec)
				if !ok {
					return nil
				}
				averagePerSec, ok := value.(int)
				if !ok {
					return nil
				}
				if 42 < averagePerSec {
					return health.Issue{
						Causes: health.Degraded,
						Code:   "too-many-http-request-retries",
						Message: "There could be an underlying networking issue, " +
							"that needs to be looked into, the system is working, " +
							"but the retry attempt average shouldn't be so high",
					}
				}
				return nil
			},
		},
	}

	ctx := context.Background()
	hs := hm.HealthCheck(ctx)
	_ = hs // use the results
}
Example (Dependency) ¶
package main

import (
	"context"
	"database/sql"

	"go.llib.dev/frameless/pkg/devops/health"
)

func main() {
	var hm health.Monitor
	var db *sql.DB // populate it with a live db connection

	hm.Dependencies = append(hm.Dependencies, func(ctx context.Context) health.Report {
		var hs health.Report

		err := db.PingContext(ctx)
		if err != nil {
			hs.Issues = append(hs.Issues, health.Issue{
				Causes:  health.Down,
				Code:    "xy-db-disconnected",
				Message: "failed to ping the database through the connection",
			})
		}

		// additional health checks on the DB dependency

		return hs
	})

	ctx := context.Background()
	hs := hm.HealthCheck(ctx)
	_ = hs // use the results
}
func (*Monitor) HTTPHandler ¶
Example ¶
package main

import (
	"context"
	"net/http"

	"go.llib.dev/frameless/pkg/devops/health"
)

func main() {
	var m = health.Monitor{
		Checks: []health.Check{
			func(ctx context.Context) error {
				return nil // all good
			},
		},
	}

	mux := http.NewServeMux()
	mux.Handle("/health", m.HTTPHandler())
	_ = http.ListenAndServe("0.0.0.0:8080", mux)
}
type Report ¶
type Report struct {
	// Name field typically contains a descriptive name for the service or application.
	Name string
	// Status is the current health status of a given service.
	//
	// By default, an empty Status is interpreted as Up Status.
	// If an Issue in Issues causes a Status change, then it will be reflected in the Report.Status as well.
	// If a dependency has a non-Up Status, then the current Status is considered PartialOutage.
	Status Status
	// Message field provides an explanation of the current state or specific issues (if any) affecting the service.
	// Message is optional, and when it's empty, the default is inferred from the Report.Status value.
	Message string
	// Issues is the list of issues that the health check functions were able to detect.
	// If an Issue in Report.Issues contains an Issue.Causes, then the Report.Status will be affected.
	Issues []Issue
	// Dependencies are the service dependencies, which are required for the service to function correctly.
	// If a Report in Report.Dependencies has a problematic Status, it will affect the Report.Status.
	Dependencies []Report
	// Timestamp represents the time at which the health check report was created.
	// Default is the current time in UTC.
	Timestamp time.Time
	// Details encompass analytical data and status indicators
	// for the service at the time the service was called.
	// For more about what values it should contain, read the documentation of DetailCheck.
	Details map[string]any
}
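The Status rules in the comments above can be read as a small decision procedure. The sketch below is one possible reading and not the library's implementation: status, issue, report and effectiveStatus are local stand-ins invented for the illustration, and the precedence between the rules (the first issue with a non-empty causes value wins, and issues outrank the dependency rule) is an assumption that the documentation does not state.

```go
package main

import "fmt"

// Local stand-ins for the documented types; they are not the library's own.
type status string

const (
	up            status = "UP"
	down          status = "DOWN"
	partialOutage status = "PARTIAL_OUTAGE"
)

type issue struct {
	causes status
}

type report struct {
	status       status
	issues       []issue
	dependencies []report
}

// effectiveStatus applies one possible reading of the Report.Status comments.
// Assumption: the first issue with a non-empty causes value wins, and it
// takes precedence over the dependency rule.
func effectiveStatus(r report) status {
	s := r.status
	if s == "" {
		s = up // an empty status is interpreted as up
	}
	for _, i := range r.issues {
		if i.causes != "" {
			return i.causes
		}
	}
	if s != up {
		return s
	}
	for _, dep := range r.dependencies {
		if effectiveStatus(dep) != up {
			return partialOutage
		}
	}
	return s
}

func main() {
	db := report{issues: []issue{{causes: down}}}
	svc := report{dependencies: []report{db}}

	fmt.Println(effectiveStatus(report{})) // UP
	fmt.Println(effectiveStatus(db))       // DOWN
	fmt.Println(effectiveStatus(svc))      // PARTIAL_OUTAGE
}
```

Under this reading, a failing database marks its own report as DOWN, while the service that depends on it is reported as PARTIAL_OUTAGE.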
type ReportJSONDTO ¶ added in v0.202.1
type ReportJSONDTO struct {
	Status       string          `json:"status"`
	Name         string          `json:"name,omitempty"`
	Message      string          `json:"message,omitempty"`
	Issues       []IssueJSONDTO  `json:"issues,omitempty"`
	Dependencies []ReportJSONDTO `json:"dependencies,omitempty"`
	Timestamp    string          `json:"timestamp,omitempty"`
	Details      map[string]any  `json:"details,omitempty"`
}
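To see what these struct tags imply for the encoded payload, the following sketch marshals a local copy of the struct with encoding/json. reportDTO and encode are hypothetical names used only here; the Issues field is left out because the fields of IssueJSONDTO are not shown in this section, and the Timestamp is left empty because its string format is not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reportDTO is a local stand-in that copies the JSON tags documented for
// ReportJSONDTO, minus the Issues field (an omission made for this sketch).
type reportDTO struct {
	Status       string         `json:"status"`
	Name         string         `json:"name,omitempty"`
	Message      string         `json:"message,omitempty"`
	Dependencies []reportDTO    `json:"dependencies,omitempty"`
	Timestamp    string         `json:"timestamp,omitempty"`
	Details      map[string]any `json:"details,omitempty"`
}

// encode returns the JSON form of r, or an empty string on failure.
func encode(r reportDTO) string {
	data, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	r := reportDTO{
		Status: "PARTIAL_OUTAGE",
		Name:   "my-service",
		Dependencies: []reportDTO{
			{Status: "DOWN", Name: "xy-db"},
		},
	}
	// Fields tagged with omitempty disappear when they hold their zero value;
	// only "status" is always present.
	fmt.Println(encode(r))
}
```

Because every field except status carries omitempty, a consumer of such a payload should treat all other keys as optional.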
type Status ¶
type Status string
const (
	// Up means that service is running correctly and able to respond to requests.
	Up Status = "UP"
	// Down means that service is not running or unresponsive.
	Down Status = "DOWN"
	// PartialOutage means that service is running, but one or more dependencies are experiencing issues.
	// PartialOutage also indicates that there has been a limited disruption or degradation in the service.
	// It typically affects only a subset of services or users, rather than the entire system.
	// Examples of partial outages include slower response times, intermittent errors,
	// or reduced functionality for specific features.
	PartialOutage Status = "PARTIAL_OUTAGE"
	// Degraded means that service is running but with reduced capabilities or performance.
	// When a system is in a Degraded state, it means that overall performance or functionality has deteriorated.
	// Unlike a PartialOutage, a Degraded state may impact a broader scope of services or users.
	// It could result in slower overall system performance, increased error rates, or reduced capacity.
	// Monitoring tools often detect this state based on predefined thresholds or deviations from expected behaviour.
	Degraded Status = "DEGRADED"
	// Maintenance means that service is currently undergoing maintenance or updates and might not function correctly.
	Maintenance Status = "MAINTENANCE"
	// Unknown means that service's status cannot be determined due to an error or lack of information.
	Unknown Status = "UNKNOWN"
)