Compare commits

...

71 Commits

Author SHA1 Message Date
Stepan Vladovskiy
e1d1096674 feat: deploying by gitea without staging
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-06-02 18:17:24 -03:00
Stepan Vladovskiy
82870a4e47 debug: precache wrapped with timeout
All checks were successful
Deploy on push / deploy (push) Successful in 1m15s
2025-05-20 11:26:30 -03:00
Stepan Vladovskiy
80b909d801 debug: with logs in precaching process
All checks were successful
Deploy on push / deploy (push) Successful in 44s
2025-05-20 11:23:00 -03:00
Stepan Vladovskiy
1ada0a02f9 debug: with timeout for precaching 2025-05-20 11:19:58 -03:00
Stepan Vladovskiy
44aef147b5 debug: moved precache to background to avoid getting stuck ...
All checks were successful
Deploy on push / deploy (push) Successful in 45s
2025-05-20 11:03:02 -03:00
Stepan Vladovskiy
2bebfbd4df debug: force rebuild core staging branch
All checks were successful
Deploy on push / deploy (push) Successful in 44s
2025-05-19 15:45:13 -03:00
Stepan Vladovskiy
f19248184a debug: without versions for starlette and ariadne
All checks were successful
Deploy on push / deploy (push) Successful in 1m5s
2025-05-18 22:48:34 +00:00
Stepan Vladovskiy
7df9361daa debug: Dockerfile with build-essential
Some checks failed
Deploy on push / deploy (push) Failing after 1m36s
2025-05-18 22:43:20 +00:00
Stepan Vladovskiy
e38a1c1338 with versions for starlette, ariadne, granian
Some checks failed
Deploy on push / deploy (push) Failing after 50s
2025-05-18 22:36:47 +00:00
Stepan Vladovskiy
1281157d93 feat: check before parsing GraphQL
All checks were successful
Deploy on push / deploy (push) Successful in 44s
2025-05-14 14:42:40 -03:00
Stepan Vladovskiy
0018749905 Merge branch 'dev' into staging
All checks were successful
Deploy on push / deploy (push) Successful in 1m28s
2025-05-14 14:33:52 -03:00
Stepan Vladovskiy
c344fcee2d refactoring(search.py): logs for search-combine and search-authors are equal
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-05-02 18:28:06 -03:00
Stepan Vladovskiy
a1a61a6731 feat: follow the same logic as shout search for authors. Store them in Redis cache + pagination
All checks were successful
Deploy on push / deploy (push) Successful in 41s
2025-05-02 18:17:05 -03:00
Stepan Vladovskiy
8d6ad2c84f refactor(author.py): remove verbose logging at resolver level 2025-05-02 18:04:10 -03:00
Stepan Vladovskiy
beba1992e9 fix(__init__.py): clean name of resolver for author search loading
All checks were successful
Deploy on push / deploy (push) Successful in 39s
2025-04-29 19:49:47 -03:00
Stepan Vladovskiy
b0296d7747 fix(__init__.py): added the new resolver to the resolvers list
All checks were successful
Deploy on push / deploy (push) Successful in 41s
2025-04-29 19:40:20 -03:00
Stepan Vladovskiy
98e3dff35e fix(author.py): resolver load_authors_search error fix
All checks were successful
Deploy on push / deploy (push) Successful in 40s
2025-04-29 18:00:38 -03:00
Stepan Vladovskiy
3782a9dffb fix(search.py, author.py): small fixes for startup; logger import was failing
All checks were successful
Deploy on push / deploy (push) Successful in 40s
2025-04-29 17:50:51 -03:00
Stepan Vladovskiy
93c00b3dd1 feat(author.py): add resolver for searching authors by text
All checks were successful
Deploy on push / deploy (push) Successful in 1m15s
2025-04-29 17:45:37 -03:00
Stepan Vladovskiy
fac43e5997 refactor(search, reader): without any kind of sorting
All checks were successful
Deploy on push / deploy (push) Successful in 42s
2025-04-24 21:00:41 -03:00
Stepan Vladovskiy
e7facf8d87 style(search.py): with indexing message
All checks were successful
Deploy on push / deploy (push) Successful in 42s
2025-04-24 18:45:00 -03:00
Stepan Vladovskiy
3062a2b7de refactor(search.py): check titles without bodies so they are not re-indexed on every startup
All checks were successful
Deploy on push / deploy (push) Successful in 42s
2025-04-24 14:58:14 -03:00
Stepan Vladovskiy
c0406dbbf2 refactor(search.py): without logger and remove duplicated def search_text
All checks were successful
Deploy on push / deploy (push) Successful in 44s
2025-04-24 14:18:14 -03:00
Stepan Vladovskiy
ab4610575f refactor(reader.py): to handle search combined
All checks were successful
Deploy on push / deploy (push) Successful in 44s
2025-04-24 13:56:38 -03:00
Stepan Vladovskiy
5425dbf832 refactor(search.py): simplify def search 2025-04-24 13:46:58 -03:00
Stepan Vladovskiy
a10db2d38a feat(search.py): combined search on shout titles and bodies 2025-04-24 13:35:36 -03:00
Stepan Vladovskiy
83e70856cd debug(server.py): I don't know why, but it appears, so I am removing it
All checks were successful
Deploy on push / deploy (push) Successful in 41s
2025-04-23 18:32:58 -03:00
Stepan Vladovskiy
11654dba68 feat: with three separate endpoints
All checks were successful
Deploy on push / deploy (push) Successful in 5s
2025-04-23 18:24:00 -03:00
Stepan Vladovskiy
ec9465ad40 merge dev
All checks were successful
Deploy on push / deploy (push) Successful in 46s
2025-04-20 19:24:59 -03:00
Stepan Vladovskiy
4d965fb27b feat(search.py): separate indexing of Shout Title, shout Body and Authors
All checks were successful
Deploy on push / deploy (push) Successful in 39s
2025-04-20 19:22:08 -03:00
Stepan Vladovskiy
e382cc1ea5 Merge branch 'dev' into feat/sv-searching-txtai
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-04-15 19:20:48 -03:00
to
83d61ca76d Merge branch 'dev' into feat/sv-searching-txtai
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-04-13 05:36:18 +00:00
Stepan Vladovskiy
106222b0e0 debug: without debug logging. clean
All checks were successful
Deploy on push / deploy (push) Successful in 1m27s
2025-04-07 11:41:48 -03:00
Stepan Vladovskiy
c533241d1e fix(reader): sorting by rank, not by id, in cache
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-04-03 13:51:13 -03:00
Stepan Vladovskiy
78326047bf fix(reader.py): change sorting and responses to queries
All checks were successful
Deploy on push / deploy (push) Successful in 50s
2025-04-03 13:20:18 -03:00
Stepan Vladovskiy
bc4ec79240 fix(search.py): store all results in cache, not only the first offset
All checks were successful
Deploy on push / deploy (push) Successful in 52s
2025-04-03 13:10:53 -03:00
Stepan Vladovskiy
a0db5707c4 feat: add cache for storing search results and hold them for working pagination. Now we have an offset to use on the frontend
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-04-01 16:01:09 -03:00
Stepan Vladovskiy
ecc443c3ad refactor(reader.py): Remove the unnecessary topic joins that cause duplicate results
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-04-01 12:57:46 -03:00
Stepan Vladovskiy
9a02ca74ad merged with dev
All checks were successful
Deploy on push / deploy (push) Successful in 1m24s
2025-03-31 13:38:32 -03:00
Stepan Vladovskiy
9ebb81cbd3 refactor(reader.py): rm debug line 2025-03-31 13:32:51 -03:00
Stepan Vladovskiy
0bc55977ac debug(reader.py): query_with_stat(info) always
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-03-27 15:18:08 -03:00
Stepan Vladovskiy
ff3a4debce debug(reader.py): trying to handle main topic ids found
All checks were successful
Deploy on push / deploy (push) Successful in 54s
2025-03-27 14:43:17 -03:00
Stepan Vladovskiy
ae85b32f69 feat(type.graphql): SearchResult with shout id
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-03-27 14:06:52 -03:00
Stepan Vladovskiy
34a354e9e3 debug(reader.py): bring back shout id in query call
All checks were successful
Deploy on push / deploy (push) Successful in 52s
2025-03-27 11:54:56 -03:00
Stepan Vladovskiy
e405fb527b refactor(search.py): moved to a single docs table for embeddings and doc storage
All checks were successful
Deploy on push / deploy (push) Successful in 50s
2025-03-25 16:42:44 -03:00
Stepan Vladovskiy
7f36f93d92 feat(search.py): detects both missing documents and null embeddings
All checks were successful
Deploy on push / deploy (push) Successful in 1m32s
2025-03-25 15:18:29 -03:00
Stepan Vladovskiy
f089a32394 debug(search.py): with more logs when checking indexing sync
All checks were successful
Deploy on push / deploy (push) Successful in 1m3s
2025-03-25 14:44:05 -03:00
Stepan Vladovskiy
1fd623a660 feat: with index sync endpoints configs
All checks were successful
Deploy on push / deploy (push) Successful in 56s
2025-03-25 13:31:45 -03:00
Stepan Vladovskiy
88012f1b8c debug(server.py): with 4 workers (threads); checking reindexing
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-25 12:21:59 -03:00
Stepan Vladovskiy
6e284640c0 feat: give a little timeout for resource stabilization
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-03-24 21:42:51 -03:00
Stepan Vladovskiy
077cb46482 debug: server.py -> threads 1, search.py -> add 3 reconnect attempts
All checks were successful
Deploy on push / deploy (push) Successful in 49s
2025-03-24 20:16:07 -03:00
Stepan Vladovskiy
60a13a9097 refactor(search.py): moved initialization logic into the search-txtai instance
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-24 19:47:02 -03:00
Stepan Vladovskiy
316375bf18 debug(search.py): increase batch size for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 1m1s
2025-03-21 17:56:54 -03:00
Stepan Vladovskiy
fb820f67fd debug(search.py): increase batch size for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 17:48:26 -03:00
Stepan Vladovskiy
f1d9f4e036 feat(search.py): with db reset endpoint
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 17:28:54 -03:00
Stepan Vladovskiy
ebb67eb311 debug: decrease chars in search.py for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 52s
2025-03-21 16:53:00 -03:00
Stepan Vladovskiy
50a8c24ead feat(search.py): documents for bulk indexing are categorized
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-21 15:40:29 -03:00
Stepan Vladovskiy
eb4b9363ab debug: change log entries; indexing does not wrap everything in documents
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 14:32:45 -03:00
Stepan Vladovskiy
19c5028a0c debug: Limit max chars for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 14:18:32 -03:00
Stepan Vladovskiy
57e1e8e6bd debug: more logs in indexing
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 14:10:09 -03:00
Stepan Vladovskiy
385057ffcd debug: with logs in indexing procedure
All checks were successful
Deploy on push / deploy (push) Successful in 54s
2025-03-21 13:45:50 -03:00
Stepan Vladovskiy
90699768ff debug: start index
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-21 13:30:23 -03:00
Stepan Vladovskiy
ad0ca75aa9 debug: no redis for indexing on the backend side
All checks were successful
Deploy on push / deploy (push) Successful in 1m41s
2025-03-19 14:47:31 -03:00
Stepan Vladovskiy
39242d5e6c debug: add logs in search.py and change input validation ... index ver too
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-12 14:13:55 -03:00
Stepan Vladovskiy
24cca7f2cb debug: something is wrong; one step back, with logs
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-12 13:11:19 -03:00
Stepan Vladovskiy
a9c7ac49d6 feat: with logs >>>
All checks were successful
Deploy on push / deploy (push) Successful in 59s
2025-03-12 13:07:27 -03:00
Stepan Vladovskiy
f249752db5 feat: moved txtai and search procedure in different instance
All checks were successful
Deploy on push / deploy (push) Successful in 2m18s
2025-03-12 12:06:09 -03:00
Stepan Vladovskiy
c0b2116da2 feat(db.py): added fetch_all_shouts, to populate the search index
All checks were successful
Deploy on push / deploy (push) Successful in 35s
2025-03-05 20:32:34 +00:00
Stepan Vladovskiy
59e71c8144 debug: fixed workflows gitea
All checks were successful
Deploy on push / deploy (push) Successful in 4m41s
2025-03-05 20:17:34 +00:00
Stepan Vladovskiy
e6a416383d debug: fixed workflows gitea
All checks were successful
Deploy on push / deploy (push) Successful in 15s
2025-03-05 20:16:32 +00:00
Stepan Vladovskiy
d55448398d feat(search.py): change to txtai server, with ai model. And fix granian workers 2025-03-05 20:08:21 +00:00
15 changed files with 1146 additions and 224 deletions

View File

@ -29,7 +29,16 @@ jobs:
if: github.ref == 'refs/heads/dev'
uses: dokku/github-action@master
with:
branch: 'dev'
branch: 'main'
force: true
git_remote_url: 'ssh://dokku@v2.discours.io:22/core'
ssh_private_key: ${{ secrets.SSH_PRIVATE_KEY }}
- name: Push to dokku for staging branch
if: github.ref == 'refs/heads/staging'
uses: dokku/github-action@master
with:
branch: 'dev'
git_remote_url: 'ssh://dokku@staging.discours.io:22/core'
ssh_private_key: ${{ secrets.SSH_PRIVATE_KEY }}
git_push_flags: '--force'

6
.gitignore vendored
View File

@ -128,6 +128,9 @@ dmypy.json
.idea
temp.*
# Debug
DEBUG.log
discours.key
discours.crt
discours.pem
@ -161,4 +164,5 @@ views.json
*.key
*.crt
*cache.json
.cursor
.cursor
.devcontainer/

View File

@ -3,6 +3,7 @@ FROM python:slim
RUN apt-get update && apt-get install -y \
postgresql-client \
curl \
build-essential \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app

12
cache/precache.py vendored
View File

@ -77,11 +77,15 @@ async def precache_topics_followers(topic_id: int, session):
async def precache_data():
logger.info("precaching...")
logger.debug("Entering precache_data")
try:
key = "authorizer_env"
logger.debug(f"Fetching existing hash for key '{key}' from Redis")
# cache reset
value = await redis.execute("HGETALL", key)
logger.debug(f"Fetched value for '{key}': {value}")
await redis.execute("FLUSHDB")
logger.debug("Redis database flushed")
logger.info("redis: FLUSHDB")
# Convert the dict into a list of arguments for HSET
@ -97,21 +101,27 @@ async def precache_data():
await redis.execute("HSET", key, *value)
logger.info(f"redis hash '{key}' was restored")
logger.info("Beginning topic precache phase")
with local_session() as session:
# topics
q = select(Topic).where(Topic.community == 1)
topics = get_with_stat(q)
logger.info(f"Found {len(topics)} topics to precache")
for topic in topics:
topic_dict = topic.dict() if hasattr(topic, "dict") else topic
logger.debug(f"Precaching topic id={topic_dict.get('id')}")
await cache_topic(topic_dict)
logger.debug(f"Cached topic id={topic_dict.get('id')}")
await asyncio.gather(
precache_topics_followers(topic_dict["id"], session),
precache_topics_authors(topic_dict["id"], session),
)
logger.debug(f"Finished precaching followers and authors for topic id={topic_dict.get('id')}")
logger.info(f"{len(topics)} topics and their followings precached")
# authors
authors = get_with_stat(select(Author).where(Author.user.is_not(None)))
logger.info(f"Found {len(authors)} authors to precache")
logger.info(f"{len(authors)} authors found in database")
for author in authors:
if isinstance(author, Author):
@ -119,10 +129,12 @@ async def precache_data():
author_id = profile.get("id")
user_id = profile.get("user", "").strip()
if author_id and user_id:
logger.debug(f"Precaching author id={author_id}")
await cache_author(profile)
await asyncio.gather(
precache_authors_followers(author_id, session), precache_authors_follows(author_id, session)
)
logger.debug(f"Finished precaching followers and follows for author id={author_id}")
else:
logger.error(f"fail caching {author}")
logger.info(f"{len(authors)} authors and their followings precached")

61
main.py
View File

@ -17,7 +17,7 @@ from cache.revalidator import revalidation_manager
from services.exception import ExceptionHandlerMiddleware
from services.redis import redis
from services.schema import create_all_tables, resolvers
from services.search import search_service
from services.search import search_service, initialize_search_index
from services.viewed import ViewedStorage
from services.webhook import WebhookEndpoint, create_webhook_endpoint
from settings import DEV_SERVER_PID_FILE_NAME, MODE
@ -34,24 +34,79 @@ async def start():
f.write(str(os.getpid()))
print(f"[main] process started in {MODE} mode")
async def check_search_service():
"""Check if search service is available and log result"""
info = await search_service.info()
if info.get("status") in ["error", "unavailable"]:
print(f"[WARNING] Search service unavailable: {info.get('message', 'unknown reason')}")
else:
print(f"[INFO] Search service is available: {info}")
# Helper to run precache with timeout and catch errors
async def precache_with_timeout():
try:
await asyncio.wait_for(precache_data(), timeout=60)
except asyncio.TimeoutError:
print("[precache] Precache timed out after 60 seconds")
except Exception as e:
print(f"[precache] Error during precache: {e}")
# indexing DB data
# async def indexing():
# from services.db import fetch_all_shouts
# all_shouts = await fetch_all_shouts()
# await initialize_search_index(all_shouts)
async def lifespan(_app):
try:
print("[lifespan] Starting application initialization")
create_all_tables()
# schedule precaching in background with timeout and error handling
asyncio.create_task(precache_with_timeout())
await asyncio.gather(
redis.connect(),
precache_data(),
ViewedStorage.init(),
create_webhook_endpoint(),
search_service.info(),
check_search_service(),
start(),
revalidation_manager.start(),
)
print("[lifespan] Basic initialization complete")
# Add a delay before starting the intensive search indexing
print("[lifespan] Waiting for system stabilization before search indexing...")
await asyncio.sleep(10) # 10-second delay to let the system stabilize
# Start search indexing as a background task with lower priority
asyncio.create_task(initialize_search_index_background())
yield
finally:
print("[lifespan] Shutting down application services")
tasks = [redis.disconnect(), ViewedStorage.stop(), revalidation_manager.stop()]
await asyncio.gather(*tasks, return_exceptions=True)
print("[lifespan] Shutdown complete")
# Initialize search index in the background
async def initialize_search_index_background():
"""Run search indexing as a background task with low priority"""
try:
print("[search] Starting background search indexing process")
from services.db import fetch_all_shouts
# Get total count first (optional)
all_shouts = await fetch_all_shouts()
total_count = len(all_shouts) if all_shouts else 0
print(f"[search] Fetched {total_count} shouts for background indexing")
# Start the indexing process with the fetched shouts
print("[search] Beginning background search index initialization...")
await initialize_search_index(all_shouts)
print("[search] Background search index initialization complete")
except Exception as e:
print(f"[search] Error in background search indexing: {str(e)}")
# Create the GraphQL instance
graphql_app = GraphQL(schema, debug=True)
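
The lifespan above is only half of the wiring; the hunk does not show how it is attached to the Starlette application. A minimal sketch, assuming a root mount for the GraphQL app (the actual route layout in main.py is not part of this diff):

```python
# Sketch only: attaching the generator-style lifespan to the app.
# Depending on the Starlette version, the async-generator lifespan may need
# to be wrapped with contextlib.asynccontextmanager, as done here.
from contextlib import asynccontextmanager

from starlette.applications import Starlette
from starlette.routing import Mount

app = Starlette(
    debug=True,
    routes=[Mount("/", graphql_app)],        # assumed mount point
    lifespan=asynccontextmanager(lifespan),  # lifespan defined above
)
```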

View File

@ -71,6 +71,34 @@ class ShoutAuthor(Base):
class Shout(Base):
"""
A publication in the system.
Attributes:
body (str)
slug (str)
cover (str) : "Cover image url"
cover_caption (str) : "Cover image alt caption"
lead (str)
title (str)
subtitle (str)
layout (str)
media (dict)
authors (list[Author])
topics (list[Topic])
reactions (list[Reaction])
lang (str)
version_of (int)
oid (str)
seo (str) : JSON
draft (int)
created_at (int)
updated_at (int)
published_at (int)
featured_at (int)
deleted_at (int)
created_by (int)
updated_by (int)
deleted_by (int)
community (int)
"""
__tablename__ = "shout"

View File

@ -13,6 +13,10 @@ starlette
gql
ariadne
granian
# NLP and search
httpx
orjson
pydantic
trafilatura

View File

@ -8,6 +8,7 @@ from resolvers.author import ( # search_authors,
get_author_id,
get_authors_all,
load_authors_by,
load_authors_search,
update_author,
)
from resolvers.community import get_communities_all, get_community
@ -73,6 +74,7 @@ __all__ = [
"get_author_follows_authors",
"get_authors_all",
"load_authors_by",
"load_authors_search",
"update_author",
## "search_authors",
# community

View File

@ -20,6 +20,7 @@ from services.auth import login_required
from services.db import local_session
from services.redis import redis
from services.schema import mutation, query
from services.search import search_service
from utils.logger import root_logger as logger
DEFAULT_COMMUNITIES = [1]
@ -301,6 +302,46 @@ async def load_authors_by(_, _info, by, limit, offset):
return await get_authors_with_stats(limit, offset, by)
@query.field("load_authors_search")
async def load_authors_search(_, info, text: str, limit: int = 10, offset: int = 0):
"""
Resolver for searching authors by text. Works with the txtai search endpoint.
Args:
text: Search text
limit: Maximum number of authors to return
offset: Offset for pagination
Returns:
list: List of authors matching the search criteria
"""
# Get author IDs from search engine (already sorted by relevance)
search_results = await search_service.search_authors(text, limit, offset)
if not search_results:
return []
author_ids = [result.get("id") for result in search_results if result.get("id")]
if not author_ids:
return []
# Fetch full author objects from DB
with local_session() as session:
# Simple query to get authors by IDs - no need for stats here
authors_query = select(Author).filter(Author.id.in_(author_ids))
db_authors = session.execute(authors_query).scalars().all()
if not db_authors:
return []
# Create a dictionary for quick lookup
authors_dict = {str(author.id): author for author in db_authors}
# Keep the order from search results (maintains the relevance sorting)
ordered_authors = [authors_dict[author_id] for author_id in author_ids if author_id in authors_dict]
return ordered_authors
def get_author_id_from(slug="", user=None, author_id=None):
try:
author_id = None
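
A hypothetical client-side call against the new load_authors_search resolver, for illustration; the endpoint URL and the selected Author fields are assumptions, not part of this diff:

```python
# Sketch: querying load_authors_search over GraphQL with httpx.
import asyncio
import httpx

AUTHOR_SEARCH = """
query AuthorSearch($text: String!, $limit: Int, $offset: Int) {
  load_authors_search(text: $text, limit: $limit, offset: $offset) {
    id
    slug
    name
  }
}
"""

async def search_authors_example(text: str):
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://core.example.org/graphql",  # assumed endpoint
            json={"query": AUTHOR_SEARCH, "variables": {"text": text, "limit": 10, "offset": 0}},
        )
        return resp.json()["data"]["load_authors_search"]

# asyncio.run(search_authors_example("Tolstoy"))
```

The resolver preserves the relevance order returned by the search service, so a client can render the list as-is without re-sorting.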

View File

@ -10,7 +10,7 @@ from orm.shout import Shout, ShoutAuthor, ShoutTopic
from orm.topic import Topic
from services.db import json_array_builder, json_builder, local_session
from services.schema import query
from services.search import search_text
from services.search import search_text, get_search_count
from services.viewed import ViewedStorage
from utils.logger import root_logger as logger
@ -187,12 +187,10 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
"""
shouts = []
try:
# logger.info(f"Starting get_shouts_with_links with limit={limit}, offset={offset}")
q = q.limit(limit).offset(offset)
with local_session() as session:
shouts_result = session.execute(q).all()
# logger.info(f"Got {len(shouts_result) if shouts_result else 0} shouts from query")
if not shouts_result:
logger.warning("No shouts found in query result")
@ -203,7 +201,6 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
shout = None
if hasattr(row, "Shout"):
shout = row.Shout
# logger.debug(f"Processing shout#{shout.id} at index {idx}")
if shout:
shout_id = int(f"{shout.id}")
shout_dict = shout.dict()
@ -231,20 +228,16 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
topics = None
if has_field(info, "topics") and hasattr(row, "topics"):
topics = orjson.loads(row.topics) if isinstance(row.topics, str) else row.topics
# logger.debug(f"Shout#{shout_id} topics: {topics}")
shout_dict["topics"] = topics
if has_field(info, "main_topic"):
main_topic = None
if hasattr(row, "main_topic"):
# logger.debug(f"Raw main_topic for shout#{shout_id}: {row.main_topic}")
main_topic = (
orjson.loads(row.main_topic) if isinstance(row.main_topic, str) else row.main_topic
)
# logger.debug(f"Parsed main_topic for shout#{shout_id}: {main_topic}")
if not main_topic and topics and len(topics) > 0:
# logger.info(f"No main_topic found for shout#{shout_id}, using first topic from list")
main_topic = {
"id": topics[0]["id"],
"title": topics[0]["title"],
@ -252,10 +245,8 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
"is_main": True,
}
elif not main_topic:
logger.warning(f"No main_topic and no topics found for shout#{shout_id}")
main_topic = {"id": 0, "title": "no topic", "slug": "notopic", "is_main": True}
shout_dict["main_topic"] = main_topic
# logger.debug(f"Final main_topic for shout#{shout_id}: {main_topic}")
if has_field(info, "authors") and hasattr(row, "authors"):
shout_dict["authors"] = (
@ -282,7 +273,6 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
logger.error(f"Fatal error in get_shouts_with_links: {e}", exc_info=True)
raise
finally:
logger.info(f"Returning {len(shouts)} shouts from get_shouts_with_links")
return shouts
@ -401,33 +391,49 @@ async def load_shouts_search(_, info, text, options):
"""
limit = options.get("limit", 10)
offset = options.get("offset", 0)
if isinstance(text, str) and len(text) > 2:
# Get search results with pagination
results = await search_text(text, limit, offset)
scores = {}
hits_ids = []
for sr in results:
shout_id = sr.get("id")
if shout_id:
shout_id = str(shout_id)
scores[shout_id] = sr.get("score")
hits_ids.append(shout_id)
if not results:
logger.info(f"No search results found for '{text}'")
return []
# Extract IDs in the order from the search engine
hits_ids = [str(sr.get("id")) for sr in results if sr.get("id")]
q = (
query_with_stat(info)
if has_field(info, "stat")
else select(Shout).filter(and_(Shout.published_at.is_not(None), Shout.deleted_at.is_(None)))
)
# Query DB for only the IDs in the current page
q = query_with_stat(info)
q = q.filter(Shout.id.in_(hits_ids))
q = apply_filters(q, options)
q = apply_sorting(q, options)
shouts = get_shouts_with_links(info, q, limit, offset)
for shout in shouts:
shout.score = scores[f"{shout.id}"]
shouts.sort(key=lambda x: x.score, reverse=True)
return shouts
q = apply_filters(q, options.get("filters", {}))
shouts = get_shouts_with_links(info, q, len(hits_ids), 0)
# Reorder shouts to match the order from hits_ids
shouts_dict = {str(shout['id']): shout for shout in shouts}
ordered_shouts = [shouts_dict[shout_id] for shout_id in hits_ids if shout_id in shouts_dict]
return ordered_shouts
return []
@query.field("get_search_results_count")
async def get_search_results_count(_, info, text):
"""
Returns the total count of search results for a search query.
:param _: Root query object (unused)
:param info: GraphQL context information
:param text: Search query text
:return: Total count of results
"""
if isinstance(text, str) and len(text) > 2:
count = await get_search_count(text)
return {"count": count}
return {"count": 0}
@query.field("load_shouts_unrated")
async def load_shouts_unrated(_, info, options):
"""

View File

@ -4,7 +4,7 @@ type Query {
get_author_id(user: String!): Author
get_authors_all: [Author]
load_authors_by(by: AuthorsBy!, limit: Int, offset: Int): [Author]
# search_authors(what: String!): [Author]
load_authors_search(text: String!, limit: Int, offset: Int): [Author!] # Search for authors by name or bio
# community
get_community: Community
@ -33,6 +33,7 @@ type Query {
get_shout(slug: String, shout_id: Int): Shout
load_shouts_by(options: LoadShoutsOptions): [Shout]
load_shouts_search(text: String!, options: LoadShoutsOptions): [SearchResult]
get_search_results_count(text: String!): CountResult!
load_shouts_bookmarked(options: LoadShoutsOptions): [Shout]
# rating

View File

@ -213,6 +213,7 @@ type CommonResult {
}
type SearchResult {
id: Int!
slug: String!
title: String!
cover: String
@ -280,3 +281,7 @@ type MyRateComment {
my_rate: ReactionKind
}
type CountResult {
count: Int!
}
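
The new get_search_results_count query pairs with load_shouts_search to drive pagination. A hedged sketch of that flow (endpoint URL and field selection are illustrative assumptions):

```python
# Sketch: fetch the total count once, then page through the search results.
import httpx

COUNT_QUERY = "query($text: String!) { get_search_results_count(text: $text) { count } }"
SEARCH_QUERY = """
query($text: String!, $options: LoadShoutsOptions) {
  load_shouts_search(text: $text, options: $options) { id slug title }
}
"""

async def paged_search(text: str, page_size: int = 10):
    url = "https://core.example.org/graphql"  # assumed endpoint
    async with httpx.AsyncClient() as client:
        count_resp = await client.post(url, json={"query": COUNT_QUERY, "variables": {"text": text}})
        total = count_resp.json()["data"]["get_search_results_count"]["count"]
        results = []
        for offset in range(0, total, page_size):
            resp = await client.post(url, json={
                "query": SEARCH_QUERY,
                "variables": {"text": text, "options": {"limit": page_size, "offset": offset}},
            })
            results.extend(resp.json()["data"]["load_shouts_search"])
        return results
```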

View File

@ -19,7 +19,7 @@ from sqlalchemy import (
inspect,
text,
)
from sqlalchemy.orm import Session, configure_mappers, declarative_base
from sqlalchemy.orm import Session, configure_mappers, declarative_base, joinedload
from sqlalchemy.sql.schema import Table
from settings import DB_URL
@ -259,3 +259,32 @@ def get_json_builder():
# Use them in the code
json_builder, json_array_builder, json_cast = get_json_builder()
# Fetch all shouts, with authors preloaded
# This function is used for search indexing
async def fetch_all_shouts(session=None):
"""Fetch all published shouts for search indexing with authors preloaded"""
from orm.shout import Shout
close_session = False
if session is None:
session = local_session()
close_session = True
try:
# Fetch only published and non-deleted shouts with authors preloaded
query = session.query(Shout).options(
joinedload(Shout.authors)
).filter(
Shout.published_at.is_not(None),
Shout.deleted_at.is_(None)
)
shouts = query.all()
return shouts
except Exception as e:
logger.error(f"Error fetching shouts for search indexing: {e}")
return []
finally:
if close_session:
session.close()
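
For context, this helper feeds the background indexing task added in main.py. A minimal usage sketch, assuming initialize_search_index accepts the list returned here:

```python
# Sketch: standalone reindexing using fetch_all_shouts.
import asyncio

from services.db import fetch_all_shouts, local_session
from services.search import initialize_search_index

async def reindex_everything():
    shouts = await fetch_all_shouts()      # opens and closes its own session
    await initialize_search_index(shouts)

async def reindex_with_session():
    with local_session() as session:
        shouts = await fetch_all_shouts(session)  # caller owns the session
        await initialize_search_index(shouts)

# asyncio.run(reindex_everything())
```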

View File

@ -29,12 +29,19 @@ async def request_graphql_data(gql, url=AUTH_URL, headers=None):
async with httpx.AsyncClient() as client:
response = await client.post(url, json=gql, headers=headers)
if response.status_code == 200:
data = response.json()
errors = data.get("errors")
if errors:
logger.error(f"{url} response: {data}")
# Check if the response has content before parsing
if response.content and len(response.content.strip()) > 0:
try:
data = response.json()
errors = data.get("errors")
if errors:
logger.error(f"{url} response: {data}")
else:
return data
except Exception as json_err:
logger.error(f"JSON decode error: {json_err}, Response content: {response.text[:100]}")
else:
return data
logger.error(f"{url}: Response is empty")
else:
logger.error(f"{url}: {response.status_code} {response.text}")
except Exception as _e:

File diff suppressed because it is too large.