vmis/backend/app/services/scheduler/schedule_system.py
VMIS Developer 42d1420f9c feat(backend): Phase 1-4 rewritten from scratch, 37/37 TDD tests passing
[Phase 0 Reset]
- Removed the old app/, alembic/versions/, and stray test scripts
- New requirements.txt (dropped caldav/redis/keycloak-lib; added apscheduler/croniter/docker/paramiko/ping3/dnspython)

[Phase 1 Database]
- SQLAlchemy models for 9 tables: tenants / accounts / schedules / schedule_logs /
  tenant_schedule_results / account_schedule_results / servers / server_status_logs / system_status_logs
- Alembic migration 001_create_all_tables (already applied to 10.1.0.20:5433/virtual_mis)
- seed.py: 3 initial rows in schedules / 4 initial rows in servers

[Phase 2 CRUD API]
- GET/POST/PUT/DELETE: /api/v1/tenants / accounts / servers / schedules
- /api/v1/system-status
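- Account codes are generated automatically (prefix + seq_no, left-padded with zeros to 4 digits)
- Status lights (lights) are taken from the latest schedule results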
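
The account-code rule above (prefix + seq_no zero-padded to 4 digits) can be sketched as follows. This is only an illustration: the function name `make_account_code` and the sample prefix are assumptions, not names taken from the repository.

```python
def make_account_code(prefix: str, seq_no: int) -> str:
    """Build an account code as prefix + seq_no, left-padded with zeros to 4 digits.

    Hypothetical helper: the real implementation in the repository is not shown here.
    """
    if seq_no < 0:
        raise ValueError("seq_no must be non-negative")
    # Format spec 04d pads with leading zeros up to a width of 4.
    # Values above 9999 are not truncated; they simply use more digits.
    return f"{prefix}{seq_no:04d}"


code_small = make_account_code("AB", 7)
code_large = make_account_code("AB", 12345)
```

With these inputs, `code_small` is `"AB0007"`, and a sequence number wider than four digits passes through unpadded.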

[Phase 3 Watchdog]
- APScheduler interval job every 3 minutes; an atomic UPDATE to status=Going prevents duplicate runs
- Manual trigger API: POST /api/v1/schedules/{id}/run
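
The duplicate-run guard mentioned above can be illustrated with a conditional UPDATE whose affected-row count tells the caller whether it won the claim. This is a minimal sketch using the standard-library `sqlite3` module rather than the project's SQLAlchemy models; the table layout, the `Waiting` status value, and the function name `try_claim` are assumptions made for the example.

```python
import sqlite3


def try_claim(conn: sqlite3.Connection, schedule_id: int) -> bool:
    """Atomically mark a schedule as Going; return True only if this call claimed it."""
    cur = conn.execute(
        # The WHERE clause makes the check and the write a single statement,
        # so two callers cannot both move the same row to 'Going'.
        "UPDATE schedules SET status = 'Going' WHERE id = ? AND status <> 'Going'",
        (schedule_id,),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical minimal schema for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedules (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO schedules (id, status) VALUES (1, 'Waiting')")
conn.commit()

first_claim = try_claim(conn, 1)   # the row was 'Waiting', so the claim succeeds
second_claim = try_claim(conn, 1)  # the row is already 'Going', so nothing is updated
```

The same pattern applies to both the 3-minute watchdog and the manual trigger endpoint: whichever caller updates the row first runs the schedule, and the other sees zero affected rows and skips.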

[Phase 4 Service Clients]
- KeycloakClient: vmis-admin realm, REST API (python-keycloak is not used)
- MailClient: Docker Mailserver @ 10.1.0.254:8080, including MX DNS verification
- DockerClient: docker-py locally + paramiko SSH for remote compose
- NextcloudClient: OCS API user/quota
- SystemChecker: functional checks (traefik routers>0 / keycloak token / SMTP EHLO / DB SELECT 1 / ping)

[TDD]
- 37 tests / 37 passed (2.11s)
- SQLite in-memory + StaticPool; no external DB required

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 13:10:15 +08:00

95 lines
3.6 KiB
Python
"""
Schedule 3: System status (daily at 08:00)
Part A: Functional checks of infrastructure services (traefik/keycloak/mail/db)
Part B: Server ping checks
"""
import logging
from datetime import datetime

from sqlalchemy.orm import Session

from app.models.server import SystemStatusLog, ServerStatusLog, Server

logger = logging.getLogger(__name__)

# Fixed 8 services: environment × service_name.
# service_desc is a display label stored on each SystemStatusLog row, so the
# original strings are kept; English glosses are given in trailing comments.
SERVICES = [
    {"environment": "test", "service_name": "traefik",
     "service_desc": "測試環境反向代理",  # test environment reverse proxy
     "host": "localhost", "port": 8080},
    {"environment": "test", "service_name": "keycloak",
     "service_desc": "測試環境 SSO",  # test environment SSO
     "url": "https://auth.lab.taipei", "realm": "master"},
    {"environment": "test", "service_name": "mail",
     "service_desc": "測試環境 Mail Server",  # test environment mail server
     "host": "localhost", "port": 587},
    {"environment": "test", "service_name": "db",
     "service_desc": "10.1.0.20:5433 PostgreSQL",
     "db_host": "10.1.0.20", "db_port": 5433},
    {"environment": "prod", "service_name": "traefik",
     "service_desc": "正式環境反向代理",  # production reverse proxy
     "host": "localhost", "port": 8080},
    {"environment": "prod", "service_name": "keycloak",
     "service_desc": "正式環境 SSO",  # production SSO
     "url": "https://auth.ease.taipei", "realm": "master"},
    {"environment": "prod", "service_name": "mail",
     "service_desc": "正式環境 Mail Server",  # production mail server
     "host": "10.1.0.254", "port": 587},
    {"environment": "prod", "service_name": "db",
     "service_desc": "10.1.0.254:5432 PostgreSQL",
     "db_host": "10.1.0.254", "db_port": 5432},
]


def run_system_status(schedule_log_id: int, db: Session):
    """Run both parts of Schedule 3 and record one log row per check."""
    from app.services.system_checker import SystemChecker

    checker = SystemChecker()

    # Part A: Infrastructure services
    for svc in SERVICES:
        result = False
        fail_reason = None
        try:
            if svc["service_name"] == "traefik":
                result = checker.check_traefik(svc["host"], svc["port"])
            elif svc["service_name"] == "keycloak":
                result = checker.check_keycloak(svc["url"], svc["realm"])
            elif svc["service_name"] == "mail":
                result = checker.check_smtp(svc["host"], svc["port"])
            elif svc["service_name"] == "db":
                result = checker.check_postgres(svc["db_host"], svc["db_port"])
        except Exception as e:
            # Any exception from a checker counts as a failed check.
            result = False
            fail_reason = str(e)
        db.add(SystemStatusLog(
            schedule_log_id=schedule_log_id,
            environment=svc["environment"],
            service_name=svc["service_name"],
            service_desc=svc["service_desc"],
            result=result,
            fail_reason=fail_reason,
            recorded_at=datetime.utcnow(),
        ))

    # Part B: Server ping
    servers = (
        db.query(Server)
        .filter(Server.is_active.is_(True))
        .order_by(Server.sort_order)
        .all()
    )
    for server in servers:
        response_time = None
        fail_reason = None
        try:
            response_time = checker.ping_server(server.ip_address)
            # A None response time means the host did not answer.
            result = response_time is not None
            if not result:
                fail_reason = "No response"
        except Exception as e:
            result = False
            fail_reason = str(e)
        db.add(ServerStatusLog(
            schedule_log_id=schedule_log_id,
            server_id=server.id,
            result=result,
            response_time=response_time,
            fail_reason=fail_reason,
            recorded_at=datetime.utcnow(),
        ))

    # One commit covers every row added in Part A and Part B.
    db.commit()
    logger.info(
        "System status check done: %d services + %d servers",
        len(SERVICES), len(servers),
    )
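
The if/elif chain in Part A could also be expressed as a lookup table keyed by `service_name`. The sketch below is one possible shape, not the repository's code: `FakeChecker` is a hypothetical stand-in for `SystemChecker` (only the four method names are taken from the calls above), and `build_dispatch` and `check_service` are illustrative names.

```python
from typing import Callable, Dict, Optional, Tuple


class FakeChecker:
    """Hypothetical stand-in for SystemChecker, used only to exercise the sketch."""

    def check_traefik(self, host: str, port: int) -> bool:
        return True

    def check_keycloak(self, url: str, realm: str) -> bool:
        return True

    def check_smtp(self, host: str, port: int) -> bool:
        # Simulate a failing service to show how the error is captured.
        raise ConnectionError("refused")

    def check_postgres(self, db_host: str, db_port: int) -> bool:
        return False


def build_dispatch(checker) -> Dict[str, Callable[[dict], bool]]:
    """Map each service_name to a callable that pulls its own arguments from svc."""
    return {
        "traefik": lambda svc: checker.check_traefik(svc["host"], svc["port"]),
        "keycloak": lambda svc: checker.check_keycloak(svc["url"], svc["realm"]),
        "mail": lambda svc: checker.check_smtp(svc["host"], svc["port"]),
        "db": lambda svc: checker.check_postgres(svc["db_host"], svc["db_port"]),
    }


def check_service(checker, svc: dict) -> Tuple[bool, Optional[str]]:
    """Return (result, fail_reason) for one service entry."""
    handler = build_dispatch(checker).get(svc["service_name"])
    if handler is None:
        # Unlike the if/elif chain, an unrecognised name gets an explicit reason.
        return False, f"Unknown service: {svc['service_name']}"
    try:
        return bool(handler(svc)), None
    except Exception as e:
        return False, str(e)


checker = FakeChecker()
ok = check_service(checker, {"service_name": "traefik", "host": "localhost", "port": 8080})
failed = check_service(checker, {"service_name": "mail", "host": "localhost", "port": 587})
unknown = check_service(checker, {"service_name": "ldap"})
```

One difference from the original chain is worth noting: an unrecognised `service_name` here produces an explicit failure reason, whereas the chain would record `result=False` with no `fail_reason`.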